Deadlock on (faked) firmware crash, CUS239, modified 10.4.3 firmware.

Ben Greear greearb at candelatech.com
Tue Mar 29 08:46:12 PDT 2016


On 03/29/2016 01:14 AM, Michal Kazior wrote:
> On 26 March 2016 at 03:27, Ben Greear <greearb at candelatech.com> wrote:
>> I've been seeing this for a while now.  When firmware crashes, often the
>> OS at least partially locks up.
>>
>> This is modified 4.4.6 driver/kernel, modified 10.4.3 firmware.  I had 35
>> stations associated, and reset one.  Flush fails (maybe because nothing
>> stops tx on other vdevs while flushing one?) and I added a fake firmware
>> crash even in case flush fails.
>>
>> Then, I get deadlock.  I've seen other similar deadlocks when the firmware
>> crashed due to 'natural' causes when adding vdevs....
>>
>> Looks like the same process is not actually stuck in one place...each time
>> the kernel splats, it is in a different place..spinning and spinning.
>> Maybe it needs a bail-out on firmware crash?
> [...]
>> [  316.477677] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/u8:3:257]
>> [  316.477720] Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 8021q garp mrp stp llc bnep bluetooth fuse macvlan wanlink(O) pktgen rpcsec_gss_krb5 nfsv4 nfs fscache iTCO_wdt iTCO_vendor_support coretemp ath9k ath10k_pci hwmon ath9k_common ath10k_core ath9k_hw intel_rapl iosf_mbi ath x86_pkg_temp_thermal intel_powerclamp mac80211 kvm_intel kvm joydev irqbypass pcspkr serio_raw cfg80211 snd_hda_codec_hdmi lpc_ich i2c_i801 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm 8250_fintek snd_timer snd shpchp soundcore tpm_tis tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ata_generic pata_acpi i915 e1000e ptp pps_core i2c_algo_bit drm_kms_helper drm i2c_core fjes video ipv6 [last unloaded: nf_conntrack]
>>
>> [  316.477721] irq event stamp: 2111179
>> [  316.477727] hardirqs last  enabled at (2111179): [<ffffffff8113c347>] vprintk_emit+0x3ab/0x46a
>> [  316.477730] hardirqs last disabled at (2111178): [<ffffffff8113bff8>] vprintk_emit+0x5c/0x46a
>> [  316.477742] softirqs last  enabled at (2111014): [<ffffffffa0e30965>] ath10k_set_key+0x136/0x602 [ath10k_core]
>> [  316.477749] softirqs last disabled at (2111012): [<ffffffffa0e30946>] ath10k_set_key+0x117/0x602 [ath10k_core]
>> [  316.477751] CPU: 1 PID: 257 Comm: kworker/u8:3 Tainted: G        W  O 4.4.6+ #21
>> [  316.477752] Hardware name: To be filled by O.E.M. To be filled by O.E.M./HURONRIVER, BIOS 4.6.5 05/02/2012
>> [  316.477780] Workqueue: wiphy3 ieee80211_iface_work [mac80211]
>> [  316.477781] task: ffff880212d225c0 ti: ffff880212d50000 task.ti: ffff880212d50000
>> [  316.477790] RIP: 0010:[<ffffffffa0e38c1b>]  [<ffffffffa0e38c1b>] ath10k_mac_tx_push_pending+0xc1/0x12d [ath10k_core]
>
> Just in case, do you have these applied?
>
>   750eeed89cf3 ath10k: fix pull-push tx threshold handling
>   9d71d47eed20 ath10k: fix tx hang

I have both of these...I'll try your patch below.

I first have to fix the hash-table bugs in mac80211, as they break
so many things that it is hard to test the rest of the system...

Thanks,
Ben

>
> Hmm.. If it still reproduces can you try the following diff?
>
> --- a/drivers/net/wireless/ath/ath10k/mac.c
> +++ b/drivers/net/wireless/ath/ath10k/mac.c
> @@ -3780,6 +3780,8 @@ void ath10k_mac_tx_push_pending(struct ath10k *ar)
>                  list_del_init(&artxq->list);
>                  if (ret != -ENOENT)
>                          list_add_tail(&artxq->list, &ar->txqs);
> +               else if (artxq == last)
> +                       last = list_last_entry(&ar->txqs, struct ath10k_txq, list);
>
>                  ath10k_htt_tx_txq_update(hw, txq);
>
>
> Michał
>
> _______________________________________________
> ath10k mailing list
> ath10k at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/ath10k
>
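
For readers following the diff above, here is a small userspace sketch of
the pattern it touches: a round-robin pass over a list of queues that ends
when it reaches the entry that was the tail at entry ("last"), and that
re-points "last" to the current tail if that entry is dropped for being
empty.  This is not the driver code.  The list helpers, struct txq,
push_some(), service_pass() and demo_visits() are all hypothetical names
made up for illustration, and the budget of 2 frames per visit is
arbitrary.  Whether the change addresses the lockup reported above is not
established in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct node {
	struct node *prev, *next;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static int list_is_empty(const struct node *head)
{
	return head->next == head;
}

static void list_unlink(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

static void list_append(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Hypothetical stand-in for a tx queue; 'list' must stay the first member. */
struct txq {
	struct node list;
	int pending;	/* frames still waiting */
};

/* Hypothetical push: send up to 'budget' frames, -ENOENT if the queue runs dry. */
static int push_some(struct txq *q, int budget)
{
	while (budget-- > 0) {
		if (q->pending == 0)
			return -ENOENT;
		q->pending--;
	}
	return 0;
}

/* One service pass; returns how many queue visits it made. */
static int service_pass(struct node *head)
{
	struct node *last = head->prev;	/* tail at entry marks the end of the pass */
	int visits = 0;

	while (!list_is_empty(head)) {
		struct txq *q = (struct txq *)head->next;
		int ret = push_some(q, 2);

		list_unlink(&q->list);
		if (ret != -ENOENT)
			list_append(&q->list, head);	/* not reported empty: rotate to tail */
		else if (&q->list == last)
			last = head->prev;	/* marker was dropped: re-point to current tail */

		visits++;
		if (&q->list == last)
			break;
	}
	return visits;
}

/* Demo: queue a has 10 frames, queue b is already empty and sits at the tail. */
static int demo_visits(int *a_left)
{
	struct node head;
	struct txq a = { .pending = 10 }, b = { .pending = 0 };

	list_init(&head);
	list_append(&a.list, &head);
	list_append(&b.list, &head);

	int visits = service_pass(&head);
	*a_left = a.pending;
	return visits;
}
```

In this toy setup demo_visits() makes three visits and leaves six frames in
queue a: the empty tail entry is dropped on the second visit, "last" moves
to a, and a gets one more turn before the pass ends.  Without the else-if
branch the same pass would stop after the second visit.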


-- 
Ben Greear <greearb at candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



