[PATCH] ath10k: Make HTT fill size configurable

Michal Kazior michal.kazior at tieto.com
Wed Jan 14 02:12:37 PST 2015

On 14 January 2015 at 10:50, Sujith Manoharan <sujith at msujith.org> wrote:
> Michal Kazior wrote:
>> We should just fix the tx/rx processing instead. The HTT throttling
>> limit was originally introduced to deal with watchdog issues we've
>> seen on AP135. Tasklets were starving system too much.
> Then the fill limit restriction is a workaround for AP135
> and shouldn't be applicable for other platforms ?

If I were to narrow it down, I'd say all uniprocessor systems. AP135 is
just an example where the problem occurs easily because its CPU is
underpowered for the task. Even a more powerful single-core system will
get into trouble: all you need is a couple of extra netfilter rules, NAT,
some running services or additional hardware to service (USB, anyone?).

>> I've been playing around with threaded irqs in ath10k in my tree and
>> I've seen improvement with Rx. However Tx instead becomes broken in
>> the process and I'm yet to find a definite and final answer why that
>> is the case. My suspicion is that NAPI, which is used by the ethernet
>> driver, runs in tasklets and they aren't frequent enough to trigger
>> ksoftirqd so they starve the system. The current non-threaded irq
>> approach yields more tasklet schedules for Tx and hits ksoftirqd more
>> often making it nice on userspace. If that's the case I don't really
>> have an idea how to solve this now.
> I haven't looked at the TX path in detail, but regarding RX, these
> were the bottlenecks:
> * RX batch indication.
> * HTT fill level.
> * netif_receive_skb() usage instead of netif_rx().
> AFAIK, the internal driver schedules only one tasklet for a CE interrupt
> and everything is done in that context, along with refilling HTT.
> ath10k has several stages: ISR -> CE tasklet -> HTT tasklet -> Replenish tasklet.
> The softirq count/load will definitely be high with so many tasklets ?

Apparently this isn't enough. Also, tasklets aren't subject to regular
scheduling policies and they just steal time from other threads. This
is important if you consider how much time a single tasklet can run -
you can actually estimate this.

800 Mbps is 100 MB/s. Assume this is what the host system can handle at
100% CPU.
The HTT Rx ring is 1000 frames long, which is ~1.5 MB of data (assuming
1500 bytes for each packet).
If you push more traffic than the host CPU can take, you eventually end
up with cycles which drain the entire HTT Rx ring and then replenish it.
100 MB/s / 1.5 MB = ~66 runs/s, which is ~15 ms for each tasklet run.
That's a lot. You might not get a chance to cycle through all the
running processes to give them their timeslice for a few seconds.
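
The estimate above can be reproduced with a few lines of Python. This is
only a back-of-the-envelope sketch: the function name and its parameters
are made up for illustration, and it assumes, as above, that the host
saturates at the given throughput and that each run drains a full ring.

```python
def tasklet_run_estimate(throughput_mbps, ring_frames, frame_bytes):
    """Return (runs per second, milliseconds per run) for a ring that is
    fully drained on every tasklet run. Hypothetical helper, not driver code."""
    bytes_per_sec = throughput_mbps * 1_000_000 / 8   # 800 Mbps -> 100 MB/s
    ring_bytes = ring_frames * frame_bytes            # 1000 * 1500 -> 1.5 MB
    runs_per_sec = bytes_per_sec / ring_bytes
    ms_per_run = 1000.0 / runs_per_sec
    return runs_per_sec, ms_per_run


runs, ms = tasklet_run_estimate(800, 1000, 1500)
print(f"{runs:.1f} runs/s, {ms:.1f} ms per run")
```

Plugging in a smaller number of frames per run shows why a fill limit
helps: with an illustrative value of 16 frames, the same formula gives
about 0.24 ms per run instead of ~15 ms.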

If you starve userspace, which runs a watchdog process, you'll end up
failing to poke the watchdog timer in the kernel and you'll get a reboot.
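
To make the failure mode concrete, here is a toy model of a watchdog
timer that must be poked before its timeout expires. It is a simulation
only: the function name, the timestamps and the 5 s timeout are
assumptions for illustration and say nothing about any real watchdog
driver or daemon.

```python
def first_watchdog_expiry(poke_times_s, timeout_s):
    """Return the time at which the watchdog would fire, or None if every
    poke in the observed window arrived before the deadline.
    The timer is assumed to be armed at t = 0."""
    deadline = timeout_s
    for t in sorted(poke_times_s):
        if t > deadline:
            return deadline          # daemon was starved past the deadline
        deadline = t + timeout_s     # a poke in time re-arms the timer
    return None


# Daemon runs every second, then gets no CPU time between t = 3 and t = 9.
print(first_watchdog_expiry([1, 2, 3, 9], timeout_s=5))
```

In this model the gap of 6 s between pokes exceeds the 5 s timeout, so
the timer fires before the daemon runs again.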

I'm starting to think tasklets are plain evil for network drivers :P

