Deadlock debugging help.

Ben Greear greearb at candelatech.com
Tue Feb 27 15:31:59 PST 2018


On 02/27/2018 01:42 PM, Ben Greear wrote:
> On 02/27/2018 12:49 PM, Ben Greear wrote:
>> I notice I can reliably lock up the kernel if I rmmod ath10k while it is under
>> heavy tx/rx traffic.  First, this causes the firmware to crash, and then right
>> after (or possibly during?) the related kernel threads deadlock.
>>
>> This is with my hacked driver and hacked firmware.  In particular, the
>> ath10k_debug_nop_dwork is something I added, though it is pretty trivial,
>> it does take the ar->conf_mutex.  It appears blocked trying to get it.
>>
>> It appears something is holding the ar->conf_mutex, but it is not clear to
>> me from the lockdep output what process actually holds it.
>> Anyone see a clue they could share?
>
> Changing how I start/stop the nop_dwork stuff seems to have made the
> problem go away, so I guess maybe that was the issue.

OK, so the problem still remains.  The 'rmmod' process appears to be the
one that is really not making progress.  Unfortunately, decoding
ath10k_pci_hif_stop+0x6f leads to an inline function in bitops.h, which
doesn't tell me where it is actually stuck.  Off to do more debugging.
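
For reference, a frame such as "ath10k_pci_hif_stop+0x6f/0xd0 [ath10k_pci]"
reads as symbol, offset into the function, total function size, and owning
module.  Below is a minimal sketch in Python that splits a frame into those
parts and builds the expression one would typically pass to gdb's "list"
command against a module object built with debug info.  The helper names
parse_frame and gdb_list_expr are hypothetical, made up for illustration;
they are not part of any kernel tooling.

```python
import re

# Matches frames like "ath10k_pci_hif_stop+0x6f/0xd0 [ath10k_pci]".
# The "[module]" suffix is absent for built-in kernel symbols.
FRAME_RE = re.compile(
    r"^(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?$"
)


def parse_frame(frame):
    """Split 'symbol+0xOFF/0xSIZE [module]' into its components."""
    m = FRAME_RE.match(frame.strip())
    if m is None:
        raise ValueError("unrecognized frame: %r" % frame)
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),  # None for built-in symbols
    }


def gdb_list_expr(frame):
    """Build the argument for gdb's 'list' command for this frame."""
    f = parse_frame(frame)
    return "*(%s+%#x)" % (f["symbol"], f["offset"])


frame = parse_frame("ath10k_pci_hif_stop+0x6f/0xd0 [ath10k_pci]")
expr = gdb_list_expr("ath10k_pci_hif_stop+0x6f/0xd0 [ath10k_pci]")
```

Note that the address in a trace frame is a return address, so it points
just past the call instruction.  When the call comes from a helper that was
inlined into the function, the reported source line can land in that helper
instead of in the function's own body; "addr2line -i" prints the chain of
inlined callers when the object carries debug info.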


[ 4037.220992] rmmod           D    0 20267   3050 0x00000080
[ 4037.220995] Call Trace:
[ 4037.220997]  __schedule+0x407/0xb70
[ 4037.220999]  ? _raw_spin_unlock_irqrestore+0x4e/0x70
[ 4037.221003]  schedule+0x38/0x90
[ 4037.221005]  schedule_timeout+0x224/0x580
[ 4037.221007]  ? retint_kernel+0x2d/0x2d
[ 4037.221010]  ? call_timer_fn+0x370/0x370
[ 4037.221015]  msleep+0x34/0x40
[ 4037.221017]  ? msleep+0x34/0x40
[ 4037.221021]  ath10k_pci_hif_stop+0x6f/0xd0 [ath10k_pci]
[ 4037.221032]  ath10k_core_stop+0x4d/0x90 [ath10k_core]
[ 4037.221038]  ath10k_halt+0x14b/0x1f0 [ath10k_core]
[ 4037.221044]  ath10k_stop+0x36/0x80 [ath10k_core]
[ 4037.221059]  drv_stop+0x58/0x2d0 [mac80211]
[ 4037.221075]  ieee80211_stop_device+0x3e/0x50 [mac80211]
[ 4037.221088]  ieee80211_do_stop+0x501/0x880 [mac80211]
[ 4037.221092]  ? dev_deactivate_many+0x2b2/0x2f0
[ 4037.221105]  ieee80211_stop+0x15/0x20 [mac80211]
[ 4037.221107]  __dev_close_many+0x93/0xe0
[ 4037.221110]  dev_close_many+0x7d/0x120
[ 4037.221114]  dev_close.part.85+0x36/0x50
[ 4037.221116]  dev_close+0x15/0x20
[ 4037.221155]  cfg80211_shutdown_all_interfaces+0x44/0xc0 [cfg80211]
[ 4037.221168]  ieee80211_remove_interfaces+0x42/0x1c0 [mac80211]
[ 4037.221180]  ieee80211_unregister_hw+0x45/0x130 [mac80211]
[ 4037.221187]  ath10k_mac_unregister+0x14/0x60 [ath10k_core]
[ 4037.221193]  ath10k_core_unregister+0x3a/0xa0 [ath10k_core]
[ 4037.221197]  ath10k_pci_remove+0x2d/0x70 [ath10k_pci]
[ 4037.221200]  pci_device_remove+0x34/0xb0
[ 4037.221203]  device_release_driver_internal+0x158/0x210
[ 4037.221206]  driver_detach+0x3b/0x80
[ 4037.221208]  bus_remove_driver+0x53/0xd0
[ 4037.221210]  driver_unregister+0x27/0x40
[ 4037.221213]  pci_unregister_driver+0x24/0x90
[ 4037.221216]  ath10k_pci_exit+0x10/0x6ee [ath10k_pci]
[ 4037.221218]  SyS_delete_module+0x1e1/0x2a0
[ 4037.221222]  do_syscall_64+0x64/0x140
[ 4037.221225]  entry_SYSCALL64_slow_path+0x25/0x25
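
In a trace like the one above, frames prefixed with "?" are addresses the
unwinder found on the stack but could not confirm as part of the call
chain.  Below is a minimal sketch in Python that strips the dmesg timestamp
and drops those entries, leaving only the confirmed frames.  The helper
name reliable_frames is hypothetical, and the sample input is copied from
the trace above.

```python
import re

# "[ 4037.221015]  msleep+0x34/0x40" -> timestamp prefix, then the frame text.
LINE_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(?P<body>.*)$")


def reliable_frames(trace_text):
    """Return frame strings from a dmesg call trace, skipping '?' entries."""
    frames = []
    for line in trace_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a timestamped dmesg line
        body = m.group("body").strip()
        # Keep only 'symbol+0xOFF/0xSIZE' entries; this also skips the
        # task header line and the "Call Trace:" line.
        if body.startswith("?") or "+0x" not in body:
            continue
        frames.append(body)
    return frames


sample = """\
[ 4037.221010]  ? call_timer_fn+0x370/0x370
[ 4037.221015]  msleep+0x34/0x40
[ 4037.221017]  ? msleep+0x34/0x40
[ 4037.221021]  ath10k_pci_hif_stop+0x6f/0xd0 [ath10k_pci]
"""
top = reliable_frames(sample)
```

Applied to the full trace, the confirmed frames run from __schedule up
through msleep and ath10k_pci_hif_stop, which is consistent with the rmmod
task sleeping inside ath10k_pci_hif_stop or something inlined into it.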

Thanks,
Ben


-- 
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



