Unicast packets stop being transmitted to a particular station, under load, when WPA2 is enabled

Dave Taht dave.taht at gmail.com
Sun May 11 19:46:10 PDT 2014


On Sun, May 11, 2014 at 7:42 PM, Dave Taht <dave.taht at gmail.com> wrote:
> On Sun, May 11, 2014 at 7:29 PM, Avery Pennarun <apenwarr at gmail.com> wrote:
>> On Sun, May 11, 2014 at 10:07 PM, Dave Taht <dave.taht at gmail.com> wrote:
>>> I have been failing to find and fix a very similar problem on the
>>> ath9k for many months now. What I see happening there is that one or
>>> more of the hardware queues locks up and stops transmitting traffic.
>>> So, for example, I might get traffic destined for the BK queue
>>> (background, traffic marked CS1) hung, but BE remains fine. Most
>>> recently I was able to lock up the VO, VI, and BK queues by exercising
>>> it overnight with multiple copies of the rrul test.
>>>
>>> I don't know much about how the hardware queues are configured on
>>> ath10k, but you can land stuff in each queue by marking with CS0, CS1,
>>> CS5, and CS7 (BE,BK,VI,VO) on mac80211 based devices.
>>
>> I think my problem may be something else.  In particular, it seems to
>> affect each station separately, and doesn't seem to happen if I
>> disable encryption.  (Does your ath9k problem trigger if encryption is
>> turned off?)
>
> No. WPA2 only so far.
>
> I will try multiple stations to see if I can get it to occur only on a
> per-station basis. (There are hardware queues for multiple forms of
> traffic, not just the visible VO, VI, BE, and BK queues.)
>
>> I also have an ath9k device in the same AP on 2.4 GHz,
>> and it doesn't trigger there either.  I haven't attempted to see if
>> your bug triggers on that one though :)
>
> It really takes work to trigger it, and I can now do it on both
> 2.4 GHz and 5 GHz. Getting it down to under 6 hours of high traffic
> recently was an accomplishment.
>
>>> I can make it happen more often, and faster, if the associated station
>>> is at a considerable distance and has less signal strength than a nearby one.
>
> There are rarely executed code paths controlling how noise rejection
> works, and all sorts of hardware issues in configuring it that vary between
> chipset versions. A ton of patches had landed in head that updated the
> ANI values in a way that worked on newer versions of the ath9k chipset,
> and they later had to be modified to deal with older ath9k chipsets.
>
>> I just checked, and my bug seems to trigger more often when I'm at a
>> longer distance (my macbook says about -60 RSSI) and less often at a
>> closer distance (currently macbook reports RSSI of -41).  Not sure if
>> this is related to increased retransmits or decreased speed or
>> something else.
>>
>>> Blow it up with netperf-wrappers -H someserver rrul...
>>
>> That's not a bad idea... I really need to get netperf-wrappers going
>> for some stress testing :)
>
> The hardware queues are rarely tested.
>
> If you just want to blow up one queue at a time, the syntax for netperf is

netperf -H someserver -t the_test -Y CS1,CS1 # or CS5,CS5 or CS6,CS6
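
For reference, here is a minimal Python sketch of which queue each class
selector used with -Y should land in. It assumes the default mac80211
behaviour of taking the top three bits of the DSCP as the 802.1d priority
and mapping that priority to an access category. The table and function
names are illustrative and not copied from the kernel source, so check
them against the driver you are actually testing.

```python
# Illustrative sketch: which WMM access category a DSCP class selector
# (CS0..CS7) is expected to land in, ASSUMING the default mac80211 mapping
# (top three DSCP bits -> 802.1d priority -> access category).

# 802.1d priority (0..7) -> access category. Assumed default table.
PRIORITY_TO_AC = ["BE", "BK", "BK", "BE", "VI", "VI", "VO", "VO"]


def class_selector_to_dscp(name: str) -> int:
    """Return the 6-bit DSCP value for a class selector such as 'CS5'."""
    name = name.strip().upper()
    if len(name) != 3 or not name.startswith("CS") or name[2] not in "01234567":
        raise ValueError(f"not a class selector: {name!r}")
    # CSn is defined as n in the top three bits of the DSCP field.
    return int(name[2]) << 3


def class_selector_to_ac(name: str) -> str:
    """Return the access category (BE, BK, VI, VO) for a class selector."""
    priority = class_selector_to_dscp(name) >> 3
    return PRIORITY_TO_AC[priority]


if __name__ == "__main__":
    for cs in ("CS0", "CS1", "CS5", "CS6", "CS7"):
        print(cs, class_selector_to_ac(cs))
```

Under that assumption CS0, CS1, CS5, and CS7 map to BE, BK, VI, and VO
respectively, and CS6 also lands in VO.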

I have been flooding all the queues with both -t TCP_STREAM and TCP_MAERTS
to make it happen using the rrul test, but I have also made it happen
with BE only.
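
A rough sketch of how the per-queue runs could be scripted, in case it
helps: it only builds and prints one netperf command line per class and
test direction, using the same -H, -t, and -Y options as the command
above. The helper name and the default class list are my own choices
here, not part of netperf or netperf-wrappers, and nothing is executed.

```python
import shlex


def netperf_commands(server,
                     classes=("CS0", "CS1", "CS5", "CS7"),
                     tests=("TCP_STREAM", "TCP_MAERTS")):
    """Build one netperf argv per (class, test) pair.

    The default classes are the ones expected to hit BE, BK, VI, and VO;
    this is a hypothetical helper, so adjust the lists for your own setup.
    """
    commands = []
    for cs in classes:
        for test in tests:
            # -Y takes the local and remote marking as a comma-separated pair.
            commands.append(
                ["netperf", "-H", server, "-t", test, "-Y", f"{cs},{cs}"]
            )
    return commands


if __name__ == "__main__":
    # Print the command lines rather than running them.
    for argv in netperf_commands("someserver"):
        print(shlex.join(argv))
```

To hammer a single queue, pass just one class, for example
netperf_commands("someserver", classes=("CS1",)).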

Getting one data point every day or so makes for slow debugging.

>
> You can also arbitrarily do tos-setting with iptables.
>
> dnsmasq uses CS6 by default, btw, so its DHCP packets land by default
> in VO and then get shuffled over to the multicast hw queue.
>
>
>>
>> Have fun,
>>
>> Avery
>



-- 
Dave Täht

NSFW: https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article



More information about the ath10k mailing list