Anyone seeing tx-credits 'hang'?

Michal Kazior michal.kazior at tieto.com
Mon Jan 12 00:06:25 PST 2015


On 9 January 2015 at 17:55, Ben Greear <greearb at candelatech.com> wrote:
[...]
> One thing I noticed yesterday is that when the driver tries to put a
> vdev down, the firmware will try to flush, and will delay vdev-down
> event until fw is flushed.  I changed CT firmware to automatically
> flush in this case, but perhaps the driver should explicitly ask
> firmware to flush the vdev before putting it down?

I recall the discussion we once had. I do plan on doing a patch for
that, eventually.


> Once the driver gets out of sync due to timeouts, the firmware
> is likely to assert soon after if wmi hang doesn't happen because
> firmware will think vdev is up when it is not, or vice versa.
>
> Also, I notice a pattern in the failure case.
>
> The sequence is almost always something like this:
>
> [lots of vdev up/down, re-associate, etc]
>
> vdev down (this would have timed out if I didn't put in the flush)
>   * vdev down is usually last wmi cmd firmware receives.
> driver tries to delete peer, that times out (firmware wmi layer never
>   saw the command)

So there's a chance the HTC layer actually did get the buffer but for
some reason decided it isn't a WMI buffer. One reason could be that
the buffer contained garbage, e.g. due to a missing memory barrier on
the host, so that firmware read data from an old physical address
still stored in the CE descriptor item.
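
To make the ordering concern concrete, here is a minimal userspace
sketch using C11 atomics. It is not ath10k code: the struct layout and
the names ring_post/ring_peek are hypothetical, and an atomic variable
stands in for the hardware write index. The point is only that the
descriptor contents must be visible before the new write index is
published; without the release ordering the consumer could see the new
index but a stale descriptor.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8 /* hypothetical ring depth, must be a power of two */

/* Hypothetical descriptor layout, loosely modelled on "address + length". */
struct desc {
	uint32_t addr;
	uint16_t nbytes;
	uint16_t flags;
};

struct ring {
	struct desc descs[RING_SIZE];
	atomic_uint write_index; /* stands in for the hardware write index */
};

/* Producer (host side): fill the descriptor first, then publish it. */
static void ring_post(struct ring *r, uint32_t addr, uint16_t nbytes)
{
	unsigned int wi = atomic_load_explicit(&r->write_index,
					       memory_order_relaxed);
	struct desc *d = &r->descs[wi & (RING_SIZE - 1)];

	d->addr = addr;
	d->nbytes = nbytes;
	d->flags = 0;

	/*
	 * Release store: the descriptor writes above become visible
	 * before the new index does. Dropping this ordering is the kind
	 * of bug that lets the consumer read an old address.
	 */
	atomic_store_explicit(&r->write_index, wi + 1, memory_order_release);
}

/* Consumer (firmware side in this analogy): returns 1 if a descriptor
 * was available at read_index and copied to *out, 0 otherwise. */
static int ring_peek(struct ring *r, unsigned int read_index,
		     struct desc *out)
{
	unsigned int wi = atomic_load_explicit(&r->write_index,
					       memory_order_acquire);

	if (read_index == wi)
		return 0; /* nothing posted yet */

	*out = r->descs[read_index & (RING_SIZE - 1)];
	return 1;
}
```

A real driver would use the kernel's own barrier and DMA primitives
rather than C11 atomics; the sketch only shows which two writes have
to stay ordered.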


> firmware reports one or two more messages to driver, and if it manages to report
> a dbglog, that shows a tx-timeout message usually within a second of
> the vdev down.  This happens whether or not I flush the vdev bringing it
> down.
>
> At this point, one more request from driver may be sent, after that,
> it is credit starvation.  Firmware continues to run (timers fire, etc).
>
> I think that firmware is also waiting on a completion event from the
> CE layer...I plan to dig into that more today.
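
For readers not familiar with the credit scheme, the starvation
described above can be sketched as a toy model. This is an
illustration only, not the HTC implementation: struct credit_chan,
chan_send and chan_credit_report are hypothetical names. The host may
send only while it holds credits, and it gets them back only when the
other side reports them, so if those reports stop, every later send
stalls even though both sides are still running.

```c
#include <stdbool.h>

/* Hypothetical host-side view of a credit-controlled channel. */
struct credit_chan {
	int credits; /* tx credits currently available to the host */
	int sent;    /* commands handed to the transport so far */
};

/* Try to send one command. Returns false when no credit is left; a
 * real driver would wait here and eventually time out. */
static bool chan_send(struct credit_chan *c)
{
	if (c->credits == 0)
		return false;

	c->credits--;
	c->sent++;
	return true;
}

/* Called when the other side reports that it has freed buffers. */
static void chan_credit_report(struct credit_chan *c, int returned)
{
	c->credits += returned;
}
```

With two credits and no reports coming back, two sends succeed and the
third fails; a single report of one credit allows exactly one more
send.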

Hm, this reminds me of the issues hw1.0 had. I'd check whether one of
the workarounds ath10k had changes anything (see
ath10k_ce_src_ring_write_index_set in ce.c in 5e3dd157ce).


Michał


