[RFTv2 2/5] ath10k: fix wmi-htc tx credit starvation
Matti Laakso
malaakso at elisanet.fi
Wed Feb 4 02:57:17 PST 2015
> On 29 January 2015 at 02:32, YanBo <dreamfly281 at gmail.com> wrote:
> > Hi Michal,
> >
> > What is the conclusion on this patch? It looks like it was not merged
> > into ath10k because it introduced some instability. I've hit another
> > issue: when a station enters hibernate mode, the AP keeps reporting
> > messages like the following, as before:
> > [ 3958.681293] ath10k_pci 0000:01:00.0: Spurious quick kickout for STA 00:03:7f:40:04:5b
> > [ 3959.681449] ath10k_pci 0000:01:00.0: Spurious quick kickout for STA 00:03:7f:40:04:5b
> > [ 3960.681696] ath10k_pci 0000:01:00.0: Spurious quick kickout for STA 00:03:7f:40:04:5b
> > [ 3961.681877] ath10k_pci 0000:01:00.0: Spurious quick kickout for STA 00:03:7f:40:04:5b
> > [ 3962.682080] ath10k_pci 0000:01:00.0: Spurious quick kickout for STA 00:03:7f:40:04:5b
> > [ 3963.682361] ath10k_pci 0000:01:00.0: Spurious quick kickout for STA 00:03:7f:40:04:5b
> > [ 3964.682550] ath10k_pci 0000:01:00.0: Spurious quick kickout for STA 00:03:7f:40:04:5b
> > [ 3965.682743] ath10k_pci 0000:01:00.0: Spurious quick kickout for STA 00:03:7f:40:04:5b
>
> The spurious STA kickout alone is most likely an aftermath of HTC Tx
> credit starvation: the client was detected as inactive by hostapd and
> was subsequently disassociated, but due to the starvation
> wmi-peer-delete was never sent to firmware, so fw thinks the peer is
> still there.
>
> I suppose fw should be restarted when ath10k is unable to submit a
> configuration command like wmi-peer-delete. It doesn't make sense to
> continue since fw-host state loses coherency and weird things can
> start to happen (spurious sta kickout is the best known example).
>
Hi Michał,

We've received some bug reports in OpenWrt (ath10k is from last
November, firmware-3.bin_10.2-00082-4-2) about a similar issue (see e.g.
https://dev.openwrt.org/ticket/18794 ), where spurious sta kickouts are
reported, eventually leading to "number of peers exceeded" and the
inability to connect more clients. After this happens a network restart
usually causes an ath10k firmware crash. The clients that cause this are
always mobile devices which regularly go in and out of the AP range. In
my case this usually starts to happen after a couple of days' uptime.

Do you think this is the same issue? Is there something I could do to
help eventually fix this?

Matti

> > Error messages like these also appeared earlier:
> >
> > [ 1316.883053] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0
> > [ 1316.912357] ath10k_pci 0000:01:00.0: failed to transmit management frame via WMI: -11
> > [ 1316.985476] ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0
> >
> > I suspect this is triggered, as you mentioned, because the HTC Tx
> > credits are drained to 0 and no other commands can be submitted. If
> > so, I'd like to hear your opinion on whether this patch is still worth
> > improving to solve this kind of issue.
>
> Yep, looks like the starvation issue.
>
> The problem with the patch is it creates ugly latencies. This has been
> reported by Avery[1] (he used/uses this patch internally for his
> purposes).
>
> Ideally mgmt frames should be sent via HTT. 10.2 is capable of sending
> raw frames via HTT so it might be possible to utilize that and forgo
> WMI mgmt tx for 10.2+. I did a proof-of-concept for raw tx on 10.2
> some time ago [2] but I haven't tested how it interacts with
> powersave buffering.
>
>
> [1]: http://thread.gmane.org/gmane.linux.drivers.ath10k.devel/638
> [2]: http://thread.gmane.org/gmane.linux.drivers.ath10k.devel/246
>
>
> Michał
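
The failure mode discussed in this thread can be sketched as a toy model in C. This is only an illustration under stated assumptions, not ath10k code: `struct model`, `wmi_send`, `peer_create`, `peer_delete`, `run_scenario`, the credit count and the peer-table size are all hypothetical. It shows a peer-delete command failing with -EAGAIN once the credits run out, which leaves a stale peer on the firmware side, and how restarting the firmware on such a failure (the idea floated in the thread) would bring host and firmware state back in sync.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define INITIAL_CREDITS 3 /* hypothetical number of HTC tx credits */
#define FW_MAX_PEERS    4 /* hypothetical firmware peer-table size */

/* Toy view of host and firmware state; every field is illustrative. */
struct model {
	int credits;    /* HTC tx credits currently available for WMI */
	int host_peers; /* peers the host side believes are associated */
	int fw_peers;   /* peers the firmware still tracks */
	bool restarted; /* set once the host restarts the firmware */
};

/*
 * Each WMI command costs one credit. Credits are never returned in this
 * model, standing in for whatever keeps the firmware from handing them back.
 */
static int wmi_send(struct model *m)
{
	assert(m->credits >= 0);
	if (m->credits == 0)
		return -EAGAIN; /* -11 on Linux, matching "via WMI: -11" */
	m->credits--;
	return 0;
}

static int peer_create(struct model *m)
{
	int ret;

	if (m->fw_peers >= FW_MAX_PEERS)
		return -ENOBUFS; /* stands in for "number of peers exceeded" */
	ret = wmi_send(m);
	if (ret)
		return ret;
	m->host_peers++;
	m->fw_peers++;
	return 0;
}

static int peer_delete(struct model *m, bool restart_on_failure)
{
	int ret = wmi_send(m);

	m->host_peers--; /* hostapd has already dropped the station */
	if (ret == 0) {
		m->fw_peers--;
		return 0;
	}
	if (restart_on_failure) {
		/* Assumed policy: restart fw so both sides agree again. */
		m->fw_peers = m->host_peers;
		m->credits = INITIAL_CREDITS;
		m->restarted = true;
	}
	return ret;
}

/* Returns the number of stale firmware peers left after the scenario. */
int run_scenario(bool restart_on_failure, struct model *out)
{
	struct model m = { .credits = INITIAL_CREDITS };

	(void)peer_create(&m);                     /* station A joins */
	(void)peer_create(&m);                     /* station B joins */
	(void)wmi_send(&m);                        /* mgmt frame takes the last credit */
	(void)peer_delete(&m, restart_on_failure); /* station A leaves */
	*out = m;
	return m.fw_peers - m.host_peers;
}
```

In this model, `run_scenario(false, &m)` leaves one stale firmware peer and no credits, while `run_scenario(true, &m)` leaves none at the cost of a firmware restart. Repeating the no-restart case would eventually fill the hypothetical peer table, which is one plausible reading of the "number of peers exceeded" reports mentioned above.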