Deadlock on (faked) firmware crash, CUS239, modified 10.4.3 firmware.
Michal Kazior
michal.kazior at tieto.com
Tue Mar 29 01:14:22 PDT 2016
On 26 March 2016 at 03:27, Ben Greear <greearb at candelatech.com> wrote:
> I've been seeing this for a while now. When firmware crashes, often the OS
> at least partially locks up.
>
> This is modified 4.4.6 driver/kernel, modified 10.4.3 firmware. I had 35
> stations associated, and reset one. Flush fails (maybe because nothing
> stops tx on other vdevs while flushing one?) and I added a fake firmware
> crash even in case flush fails.
>
> Then, I get deadlock. I've seen other similar deadlocks when the firmware
> crashed due to 'natural' causes when adding vdevs....
>
> Looks like the same process is not actually stuck in one place...each time
> the kernel splats, it is in a different place..spinning and spinning.
> Maybe it needs a bail-out on firmware crash?
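
To make the bail-out idea concrete, here is a minimal userspace sketch. It is
not the ath10k code: every name in it (model_dev, model_txq, model_push_one,
model_push_pending, frames_until_crash) is hypothetical, and the "firmware" is
just a counter that stops accepting frames after a set number of pushes. It
only shows that a round-robin push loop which waits for all queues to drain
needs to check a crash flag, because after a crash the queues never drain.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one per-station tx queue. */
struct model_txq {
	int pending;		/* frames still waiting to be pushed */
};

/* Hypothetical stand-in for the device state. */
struct model_dev {
	bool crashed;		/* set once the fake firmware has crashed */
	int frames_until_crash;	/* negative: the firmware never crashes */
};

/* Push one frame; fails (and marks the crash) once the firmware is dead. */
static bool model_push_one(struct model_dev *dev, struct model_txq *q)
{
	if (dev->crashed)
		return false;
	if (dev->frames_until_crash == 0) {
		dev->crashed = true;
		return false;
	}
	if (dev->frames_until_crash > 0)
		dev->frames_until_crash--;
	q->pending--;
	return true;
}

/*
 * Round-robin over the queues, one frame per queue per pass, until all
 * queues are empty. Returns the number of frames pushed, or -1 if the
 * loop bailed out because the crash flag was set.
 */
static int model_push_pending(struct model_dev *dev,
			      struct model_txq *qs, size_t n)
{
	int pushed = 0;

	for (;;) {
		bool work_left = false;

		for (size_t i = 0; i < n; i++) {
			/* The bail-out: without it the loop spins forever. */
			if (dev->crashed)
				return -1;
			if (qs[i].pending == 0)
				continue;
			work_left = true;
			if (model_push_one(dev, &qs[i]))
				pushed++;
		}
		if (!work_left)
			return pushed;
	}
}
```

In this model, removing the crashed check leaves the outer loop with no exit
once pushes start failing, since pending never reaches zero. Whether the real
driver spins for the same reason is an open question here, not a finding.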
[...]
> [ 316.477677] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/u8:3:257]
> [ 316.477720] Modules linked in: nf_conntrack_netlink nf_conntrack
> nfnetlink nf_defrag_ipv4 8021q garp mrp stp llc bnep bluetooth fuse macvlan
> wanlink(O) pktgen rpcsec_gss_krb5 nfsv4 nfs fscache iTCO_wdt
> iTCO_vendor_support coretemp ath9k ath10k_pci hwmon ath9k_common ath10k_core
> ath9k_hw intel_rapl iosf_mbi ath x86_pkg_temp_thermal intel_powerclamp
> mac80211 kvm_intel kvm joydev irqbypass pcspkr serio_raw cfg80211
> snd_hda_codec_hdmi lpc_ich i2c_i801 snd_hda_codec_realtek
> snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep
> snd_seq snd_seq_device snd_pcm 8250_fintek snd_timer snd shpchp soundcore
> tpm_tis tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc ata_generic
> pata_acpi i915 e1000e ptp pps_core i2c_algo_bit drm_kms_helper drm i2c_core
> fjes video ipv6 [last unloaded: nf_conntrack]
>
> [ 316.477721] irq event stamp: 2111179
> [ 316.477727] hardirqs last enabled at (2111179): [<ffffffff8113c347>] vprintk_emit+0x3ab/0x46a
> [ 316.477730] hardirqs last disabled at (2111178): [<ffffffff8113bff8>] vprintk_emit+0x5c/0x46a
> [ 316.477742] softirqs last enabled at (2111014): [<ffffffffa0e30965>] ath10k_set_key+0x136/0x602 [ath10k_core]
> [ 316.477749] softirqs last disabled at (2111012): [<ffffffffa0e30946>] ath10k_set_key+0x117/0x602 [ath10k_core]
> [ 316.477751] CPU: 1 PID: 257 Comm: kworker/u8:3 Tainted: G W O 4.4.6+ #21
> [ 316.477752] Hardware name: To be filled by O.E.M. To be filled by O.E.M./HURONRIVER, BIOS 4.6.5 05/02/2012
> [ 316.477780] Workqueue: wiphy3 ieee80211_iface_work [mac80211]
> [ 316.477781] task: ffff880212d225c0 ti: ffff880212d50000 task.ti: ffff880212d50000
> [ 316.477790] RIP: 0010:[<ffffffffa0e38c1b>] [<ffffffffa0e38c1b>] ath10k_mac_tx_push_pending+0xc1/0x12d [ath10k_core]

Just in case, do you have these applied?

  750eeed89cf3 ath10k: fix pull-push tx threshold handling
  9d71d47eed20 ath10k: fix tx hang

Hmm... If it still reproduces, can you try the following diff?

--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -3780,6 +3780,8 @@ void ath10k_mac_tx_push_pending(struct ath10k *ar)
 		list_del_init(&artxq->list);
 		if (ret != -ENOENT)
 			list_add_tail(&artxq->list, &ar->txqs);
+		else if (artxq == last)
+			last = list_last_entry(&ar->txqs, struct ath10k_txq, list);
 		ath10k_htt_tx_txq_update(hw, txq);

Michał