Anyone seeing tx-credits 'hang'?
Michal Kazior
michal.kazior at tieto.com
Tue Jan 20 23:22:27 PST 2015
On 20 January 2015 at 05:34, Ben Greear <greearb at candelatech.com> wrote:
> Ok, so I think I've mostly got this figured out...at least enough to
> work around the problem.
>
> It seems that the firmware and/or NIC hardware stops doing CE interrupts
> for the WMI rings (at least). If I force a poll of
> the rings, then packets are found and may be processed.
So you just keep calling ath10k_hif_send_complete_check() (or
ath10k_ce_per_engine_service) for polling, right?
> In one case I looked at closely, it seems IRQs went away for around 30
> seconds, and then for no obvious reason IRQs for the rings started being
> delivered and processed again. ~20 WMI messages were processed due to
> polling CE rings in this interval.
Out of curiosity - what irq mode are you using? Shared or MSI? Or did
you try both?
> The combination of WMI keep-alive messages sent from host, and
> timer to check for timeouts (and do CE polling at higher intervals
> when timeout is detected) appears to be enough. I also check
> for the IRQ working again and stop the polling at that time.
>
> I plan to clean the firmware changes up and commit them to my
> own repo...but it will require host changes to enable the keep-alive
> to fully work around this problem. Probably none of this will make
> it upstream....
We could add a watchdog to WMI which uses the `echo` command and looks
at echo events and tx credit completions (WMI is notified about those).
If neither arrives in a timely fashion (let's say 1s, which is less
than the WMI command timeout of 3s) we start polling until things
settle down. This should work with standard firmware, no?
Michał