ath10k: kernel panic on 8f975e16cd2b456ee40a796e9c61797b58670706

Michal Kazior michal.kazior at tieto.com
Tue Sep 2 06:48:42 PDT 2014


Hi,

While playing with some stuff I managed to crash the device very early
during wait_for_device and I've hit the following:

[  171.259328] ath10k_pci 0000:00:05.0: device has crashed during init
[  171.264129] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  171.265095] IP: [<ffffffffa0058005>] ath10k_debug_get_new_fw_crash_data+0x15/0x30 [ath10k_core]
[  171.265095] PGD 0
[  171.265095] Oops: 0002 [#1] SMP
[  171.265095] Modules linked in: ath10k_pci(O) ath10k_core(O) ath [last unloaded: ath]
[  171.265095] CPU: 3 PID: 29 Comm: kworker/u8:1 Tainted: G O   3.17.0-rc2-wl-ath+ #447
[  171.265095] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[  171.265095] Workqueue: ath10k_wq ath10k_core_register_work [ath10k_core]
[  171.265095] task: ffff88001eb01ad0 ti: ffff88001eb60000 task.ti: ffff88001eb60000
[  171.265095] RIP: 0010:[<ffffffffa0058005>]  [<ffffffffa0058005>] ath10k_debug_get_new_fw_crash_data+0x15/0x30 [ath10k_core]
[  171.265095] RSP: 0018:ffff88001eb63ce8  EFLAGS: 00010246
[  171.265095] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  171.265095] RDX: 0000000000000000 RSI: ffffc90001a09030 RDI: 0000000000000001
[  171.265095] RBP: ffff88001eb63cf0 R08: 0000000000000000 R09: ffff8800000bb200
[  171.265095] R10: 00000000000001e2 R11: ffff88001eb638de R12: ffff88001d7459a0
[  171.265095] R13: ffff88001d746ab0 R14: 00000000fffe14d4 R15: ffff88001d747c60
[  171.265095] FS:  0000000000000000(0000) GS:ffff88001fd80000(0000) knlGS:0000000000000000
[  171.265095] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  171.265095] CR2: 0000000000000000 CR3: 000000001df34000 CR4: 00000000000006e0
[  171.265095] Stack:
[  171.265095]  ffff88001d7459a0 ffff88001eb63d58 ffffffffa0083bbe ffff880000000010
[  171.265095]  ffff88001eb63d68 ffff88001eb63d18 0000000000000002 0000000000059010
[  171.265095]  ffffffffa0086fef 00000000deadbeef ffff88001d747a28 ffff88001d7459a0
[  171.265095] Call Trace:
[  171.265095]  [<ffffffffa0083bbe>] ath10k_pci_fw_crashed_dump+0x2e/0xd0 [ath10k_pci]
[  171.265095]  [<ffffffffa0085410>] __ath10k_pci_hif_power_up+0x5f0/0x700 [ath10k_pci]
[  171.265095]  [<ffffffffa0085550>] ath10k_pci_hif_power_up+0x30/0xe0 [ath10k_pci]
[  171.265095]  [<ffffffffa005bc7b>] ath10k_core_register_work+0x2b/0x520 [ath10k_core]
[  171.265095]  [<ffffffff810689cc>] process_one_work+0x18c/0x3f0
[  171.265095]  [<ffffffff81069011>] worker_thread+0x121/0x4a0
[  171.265095]  [<ffffffff81068ef0>] ? rescuer_thread+0x2c0/0x2c0
[  171.265095]  [<ffffffff8106daf2>] kthread+0xd2/0xf0
[  171.265095]  [<ffffffff8106da20>] ? kthread_create_on_node+0x170/0x170
[  171.265095]  [<ffffffff81857cfc>] ret_from_fork+0x7c/0xb0
[  171.265095]  [<ffffffff8106da20>] ? kthread_create_on_node+0x170/0x170
[  171.265095] Code: 8b 40 38 48 c7 80 00 01 00 00 00 00 00 00 5b 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 8b 9f 90 1d 00 00 48 8d 7b 01 <c6> 03 01 e8 e3 ec 2b e1 48 8d 7b 18 e8 6a 4f 05 e1 48 89 d8 5b
[  171.265095] RIP  [<ffffffffa0058005>] ath10k_debug_get_new_fw_crash_data+0x15/0x30 [ath10k_core]
[  171.265095]  RSP <ffff88001eb63ce8>
[  171.265095] CR2: 0000000000000000
[  171.265095] ---[ end trace 5d0ed15b050bcc1f ]---
[  171.265095] Kernel panic - not syncing: Fatal exception in interrupt
[  171.265095] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[  171.265095] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

The reason is that ath10k_debug_create() is called during core_register
*after* probe_fw. If the firmware crashes during probe_fw (and I suppose
it doesn't need to be an "early" crash like this one), crash_data has not
been allocated yet, so ath10k ends up dereferencing a NULL crash_data.

So we either add an extra check for the NULL case or we rework the debug
lifecycle. Currently debug attaches to the mac80211/wiphy debugfs, so part
of it has to be done after probing fw anyway (allocating crash_data could
be done earlier). Thoughts?
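
To make the two options concrete, here is a minimal userspace sketch,
not a patch. Every struct and function name ending in _sketch is a
hypothetical stand-in for the driver's real definitions, and the
crashed_since_read field is an assumption about what the faulting store
writes. Option 1 is the NULL guard in get_new_crash_data_sketch().
Option 2 splits debug setup into an early allocation step and a late
debugfs step.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the driver's crash-data record. */
struct crash_data_sketch {
	bool crashed_since_read;
};

/* Hypothetical stand-in for the per-device struct. */
struct ar_sketch {
	struct crash_data_sketch *crash_data; /* NULL until allocated */
	bool debugfs_registered;
};

/* Option 2, early half: allocate crash_data before probe_fw can run. */
static int debug_create_sketch(struct ar_sketch *ar)
{
	ar->crash_data = calloc(1, sizeof(*ar->crash_data));
	return ar->crash_data ? 0 : -1;
}

/* Option 2, late half: the part that needs the mac80211/wiphy debugfs. */
static void debug_register_sketch(struct ar_sketch *ar)
{
	ar->debugfs_registered = true;
}

static void debug_destroy_sketch(struct ar_sketch *ar)
{
	free(ar->crash_data);
	ar->crash_data = NULL;
	ar->debugfs_registered = false;
}

/* Option 1: return NULL instead of writing through a NULL crash_data. */
static struct crash_data_sketch *get_new_crash_data_sketch(struct ar_sketch *ar)
{
	struct crash_data_sketch *crash_data = ar->crash_data;

	if (!crash_data)
		return NULL;

	crash_data->crashed_since_read = true;
	return crash_data;
}

/* Caller side: skip the dump when there is nowhere to store it. */
static bool fw_crashed_dump_sketch(struct ar_sketch *ar)
{
	struct crash_data_sketch *crash_data = get_new_crash_data_sketch(ar);

	if (!crash_data)
		return false;

	/* ... a real driver would fill in registers, timestamps, etc. ... */
	return true;
}
```

With only the guard, a crash before debug setup is survived but the dump
is lost. With the early allocation step run before probe_fw, the dump is
kept, and the guard is then just a safety net.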


Michał


