[LEDE-DEV] [Make-wifi-fast] ath9k airtime fairness stability issues?

Dave Taht dave.taht at gmail.com
Thu Jan 5 06:51:47 PST 2017

On Thu, Jan 5, 2017 at 6:03 AM, Michal Kazior <michal.kazior at tieto.com> wrote:
> On 5 January 2017 at 14:23, Felix Fietkau <nbd at nbd.name> wrote:
>> On 2017-01-05 14:22, Loganaden Velvindron wrote:
>>> On Thu, Jan 5, 2017 at 4:59 PM, Dave Taht <dave.taht at gmail.com> wrote:
>>>> Felix:
>>>> Was there a bugreport?  (don't see one)
>>>> Do you have a specific device or behavior triggering this revert?
>>>> On Thu, Jan 5, 2017 at 4:42 AM, Dave Taht <dave.taht at gmail.com> wrote:
>>>>> https://github.com/lede-project/source/commit/c296ba834db4ce8c71e0ad7030aab188fe60b27b
>>> Hi nbd & Toke,
>>> Would it be possible to enable it only on platforms like the tp-link
>>> archer c7 v2 and the ubnt, where we have confirmed test reports for
>>> the upcoming release ?
>> I think it's quite unlikely that these issues are hardware specific.
>> It's probably more related to the environment, types of clients, or even
>> traffic patterns.
> Some people are complaining ath10k is unstable for them when
> wake_tx_queue is enabled. I suspect the ATF problem in ath9k might be
> providing extra opportunities to hit the same bug.

Hmm. I would assume most ath10k users are on a multi-core?

> I think RCU is not properly handled. txq_info shares lifecycle of
> sta_info and should therefore be protected in the same manner. When
> you queue up ieee80211_txq in a driver and use it later you
> effectively break RCU. Grabbing rcu_read_lock() *later*, e.g. when
> re-scheduling tx is not sufficient to protect from the possible race
> of part1/part2 of station destroying logic and driver accessing its
> internal txq list.
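
To make the suspected race concrete, here is a minimal single-threaded sketch in plain C. It is not mac80211 code: `struct txq`, `struct sta`, `struct driver`, `drv_wake_txq()`, `drv_pre_remove()` and `sta_destroy()` are hypothetical stand-ins, and no real RCU is involved. It only models the ordering being argued for: a driver that stashes a txq pointer in its own list must drop that pointer before the station (and the txq embedded in it) is freed, which is the kind of step a hook like drv_sta_pre_rcu_remove() would allow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the real mac80211 structures. */
struct txq {
    struct txq *next;   /* link in the driver's private pending list */
    int queued;         /* 1 while linked into that list */
};

struct sta {
    struct txq txq;     /* the txq shares the station's lifetime */
};

struct driver {
    struct txq *pending;    /* txqs the driver intends to service later */
};

/* Driver side: remember a txq for later scheduling. */
static void drv_wake_txq(struct driver *d, struct txq *q)
{
    if (q->queued)
        return;
    q->next = d->pending;
    d->pending = q;
    q->queued = 1;
}

/* Driver side: unlink a station's txq before the station goes away. */
static void drv_pre_remove(struct driver *d, struct sta *s)
{
    struct txq **pp = &d->pending;

    while (*pp) {
        if (*pp == &s->txq) {
            *pp = s->txq.next;
            s->txq.next = NULL;
            s->txq.queued = 0;
            return;
        }
        pp = &(*pp)->next;
    }
}

/* Core side: part 1 drops the driver's references, part 2 frees. */
static void sta_destroy(struct driver *d, struct sta *s)
{
    drv_pre_remove(d, s);   /* part 1: no driver list may point at s */
    /* a real kernel would wait for an RCU grace period here */
    free(s);                /* part 2 */
}

/* Count the txqs still linked into the driver's pending list. */
static size_t drv_pending_count(const struct driver *d)
{
    size_t n = 0;

    for (const struct txq *q = d->pending; q; q = q->next)
        n++;
    return n;
}
```

If `sta_destroy()` skipped the `drv_pre_remove()` call, the driver's list would keep a pointer into freed memory, and taking a read lock at the time of the later list walk would not change that.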

Sounds like a promising theory. Most of our testing was on single-core
devices, with the multi-core x86 version being kernel mainline
(4.8ish), and not the lede backport.

I long had mildly poor results in terms of throughput on the apu2 (x86
dual core), but assumed it was due to poor antennas. (no crashes)

The omnia is a dual core arm, but I don't have one of those.

As it turns out, the UAP-lite I flashed ~2 days back has crashed and is
down right now, and another box was failing to get DHCP addresses, not
even over ethernet (which is why I was looking at multicast).

(someone remind me to not take a vacation over the holidays, next time
there's holidays)

> There already seems to be a mechanism to hook into to fix that:
> drv_sta_pre_rcu_remove().
> I've only occasionally been looking at the ath10k problem and noticed
> this bit. I didn't get a chance (and probably won't, any time soon) to
> take a closer look, nor to test/verify it for that matter.
> Michał

Dave Täht
Let's go make home routers and wifi faster! With better software!
