htt rx stopped. cannot recover

Pushpal Sidhu psidhu at gateworks.com
Mon Nov 17 17:20:45 PST 2014


On Fri, Nov 14, 2014 at 12:31 AM, Michal Kazior <michal.kazior at tieto.com> wrote:
> I've used ath10k AP in a bridge many times and haven't seen this issue yet.

I just found that I can reproduce the problem by passing traffic
between AP <-wifi-> STA with no bridging. The reason I didn't see the
problem before is that it takes considerably longer to reproduce
(65 seconds vs 5 seconds) and I wasn't necessarily looking for it.

> Putting an interface into a bridge enables promiscuous mode on it. In
> ath10k this is handled by creating a monitor vdev to implicitly
> influence Rx filters so that hw/fw pass everything up to host. I
> suspect your RF environment may contain particular traffic patterns
> which trigger the problem within firmware code related to monitor
> vdev.
>
> You could try hacking up ath10k to not create/start monitor vdev and
> see if you can still reproduce the problem. Keep in mind bridging will
> be crippled in some cases since firmware won't deliver some frames to
> ath10k.

I've actually been using attenuators for these tests, a little more
than 60dB on each chain. As I mentioned above, I don't believe it's
a monitor vdev issue, since I can reproduce this without
bridging (which doesn't put the radio into promiscuous mode and thus
doesn't create a monitor vdev, correct?).

> Hmm.. Now that I think about it, after the recent Rx rework it might
> be possible to circumvent the problem. ath10k uses very little data
> from Rx indication event and instead Rx descriptor is used for most
> things. Popping function would need some changes so that it can back
> off safely if a frame buffer isn't ready. ath10k would probably need
> to poll for Rx too.

I saw those patches and I liked what I saw. When I tested with them in
place, I found it failed in the same 'test' of checking for MSDU_DONE
flag (duh, just wanted to make that clear, haha). I tried not setting
the confused flag when the issue was hit, but eventually a kernel
panic would occur.
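
For reference, the back-off idea from the quoted paragraph above could be
sketched roughly as follows. This is a standalone illustration and not
ath10k code: the ring layout, the RX_FLAG_MSDU_DONE bit value, and the
function names rx_ring_pop/rx_ring_poll are all hypothetical. The only
point is that the pop function leaves the read index alone when the
descriptor isn't marked done, so a later poll can retry the same slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 4u             /* hypothetical; must be a power of two */
#define RX_FLAG_MSDU_DONE 0x1u   /* hypothetical bit, not the real descriptor layout */

struct rx_desc {
    volatile uint32_t flags;     /* written by the device (simulated here) */
    uint32_t len;
};

struct rx_ring {
    struct rx_desc desc[RING_SIZE];
    unsigned int read_idx;
};

/*
 * Return the next completed descriptor, or NULL if the device has not
 * marked it done yet. On NULL the read index is NOT advanced, so the
 * caller can back off and retry the same slot later.
 */
static struct rx_desc *rx_ring_pop(struct rx_ring *ring)
{
    assert(ring != NULL);

    struct rx_desc *d = &ring->desc[ring->read_idx & (RING_SIZE - 1u)];

    if (!(d->flags & RX_FLAG_MSDU_DONE))
        return NULL;                      /* not ready: back off safely */

    d->flags &= ~RX_FLAG_MSDU_DONE;       /* hand the slot back for reuse */
    ring->read_idx++;
    return d;
}

/* Poll the ring: consume up to 'budget' completed frames, return the count. */
static unsigned int rx_ring_poll(struct rx_ring *ring, unsigned int budget)
{
    unsigned int n = 0;

    while (n < budget && rx_ring_pop(ring) != NULL)
        n++;
    return n;
}
```

A real driver would also have to sync the DMA buffer for CPU access
before reading the flag, which this sketch leaves out entirely.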

Something very interesting that I found is that I only see this issue when
the radios are behind a PCIe bridge. That is, I tested on two sets
of boards with the same CPU/structure, the one difference being that
one has no PCIe bridge. I am currently running an extended
test on the board without the PCIe bridge to see whether the issue
manifests itself at a later time, but as of now it's looking like
that's not the case. Do you have any thoughts on this?

- Pushpal


