Firmware crash when sending large numbers of forwarded packets

Avery Pennarun apenwarr at gmail.com
Sat Mar 8 03:20:28 EST 2014


On Sat, Mar 8, 2014 at 2:03 AM, Kalle Valo <kvalo at qca.qualcomm.com> wrote:
> Avery Pennarun <apenwarr at gmail.com> writes:
>> I'm having a problem where if I transmit too fast out the ath10k
>> interface in AP mode, I get a near-immediate firmware crash.
>
> [...]
>
>> Versions:
>> - kernel is based on current kvalo/for-linville branch (should I try
>> something else?) but seems to be the same in linux-next-20140114 so I
>> don't think this behaviour has changed lately.
>
> I do not recommend using for-linville branch for anything. As the name
> implies, it's only for John Linville to pull ath10k and ath6kl changes
> to his tree.
>
> What I recommend is to use the master branch of my ath.git tree. That's
> fairly recent wireless-testing (max 2 weeks old) plus latest ath10k +
> ath6kl patches I have (ie. merge of wireless-testing and my ath-next
> branch).

Ok, thanks.  We're using a fairly old kernel on our device right now
(3.2.26) so we're using the ath10k driver from linux-backports.  This
means it's a little tricky to pick an arbitrary version if it has
diverged too far from linux/master or linux-next.  I did try a few
different versions though and they did the same thing.

>> - firmware version 10.1.467.2-1, but also tested with 10.1.467.1-1
>> with no difference.
>>
>> I assume other people are not experiencing this or they would have
>> mentioned it by now.  What can I do to help debug this?
>
> We have reported the issue to the firmware team and got some feedback
> already. Hopefully we know more early next week.

Thanks!

Another update.  On a whim, based on the earlier mention that problems
might be related to extra burstiness of forwarding vs. local traffic
generation, I decided to add a udelay() before transmitting each
packet.  I started with udelay(1000) and the problem went away
(although of course performance was terrible).  I slowly reduced the
delay until I reached ndelay(1), and the problem stayed gone.  So I
tried a mb() instead:

diff --git a/drivers/net/wireless/ath/ath10k/ce.c b/drivers/net/wireless/ath/ath10k/ce.c
index a79499c..a808d82 100644
--- a/drivers/net/wireless/ath/ath10k/ce.c
+++ b/drivers/net/wireless/ath/ath10k/ce.c
@@ -291,6 +291,7 @@ int ath10k_ce_send_nolock(struct ath10k_ce_pipe *ce_state,
 	if (ret)
 		return ret;
 
+	mb();
 	if (unlikely(CE_RING_DELTA(nentries_mask,
 				   write_index, sw_index - 1) <= 0)) {
 		ret = -ENOSR;
-- 
1.9.0.279.gdc9e3eb


Somehow this eliminates my firmware crashes.  It's extremely reliable;
add this line and my crashes go away.  Remove this line and my UDP
iperf can crash the firmware in a couple of seconds.
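
For anyone not used to barriers, here is a minimal userspace sketch of the
kind of ordering a full barrier can enforce on a descriptor ring: the
descriptor write must be visible before the new write index is published.
This is not the ath10k code. The demo_ring names, the ring size, and the use
of C11 atomic_thread_fence() as a rough stand-in for mb() are all assumptions
made for illustration, and I do not know whether this is what the mb() above
actually fixes; it may just be changing the timing.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_ENTRIES 8u                  /* hypothetical size, power of two */
#define RING_MASK    (RING_ENTRIES - 1u)

/* Hypothetical ring, loosely modelled on a producer/consumer DMA ring. */
struct demo_ring {
	uint32_t desc[RING_ENTRIES];     /* stand-in for descriptors */
	atomic_uint write_index;         /* advanced by the producer */
	atomic_uint sw_index;            /* advanced by the consumer */
};

static void demo_ring_init(struct demo_ring *r)
{
	atomic_init(&r->write_index, 0u);
	atomic_init(&r->sw_index, 0u);
}

/* Free slots; one slot stays empty so that full and empty differ. */
static unsigned int demo_ring_free(unsigned int write_index,
				   unsigned int sw_index)
{
	return (sw_index - write_index - 1u) & RING_MASK;
}

/* Returns 0 on success, -1 if the ring is full. */
static int demo_ring_push(struct demo_ring *r, uint32_t value)
{
	unsigned int w = atomic_load_explicit(&r->write_index,
					      memory_order_relaxed);
	unsigned int s = atomic_load_explicit(&r->sw_index,
					      memory_order_acquire);

	if (demo_ring_free(w, s) == 0)
		return -1;

	r->desc[w & RING_MASK] = value;

	/* Full fence: the descriptor store above may not be reordered
	 * after the index store below (assumed analogue of mb()). */
	atomic_thread_fence(memory_order_seq_cst);

	atomic_store_explicit(&r->write_index, (w + 1u) & RING_MASK,
			      memory_order_relaxed);
	return 0;
}

/* Returns 0 on success, -1 if the ring is empty. */
static int demo_ring_pop(struct demo_ring *r, uint32_t *out)
{
	unsigned int s = atomic_load_explicit(&r->sw_index,
					      memory_order_relaxed);
	unsigned int w = atomic_load_explicit(&r->write_index,
					      memory_order_relaxed);

	if (s == w)
		return -1;

	/* Pairs with the producer's fence: read the index first,
	 * then the descriptor it covers. */
	atomic_thread_fence(memory_order_seq_cst);

	*out = r->desc[s & RING_MASK];
	atomic_store_explicit(&r->sw_index, (s + 1u) & RING_MASK,
			      memory_order_release);
	return 0;
}
```

Without the fence in demo_ring_push(), the compiler or CPU would be free to
publish the new index before the descriptor contents, so a consumer could
read a stale entry.  Real hardware rings add DMA and MMIO ordering rules on
top of this, which the sketch does not model.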

For this particular test I was using a backports built from linux
v3.11.8 merged with your ath10k-stable-3.11-8 tag.

Any idea why this would make any difference?

Thanks,

Avery


