pci_alloc_irq_vectors fails ENOSPC for XPS 13 9310

wi nk wink at technolu.st
Wed Nov 18 05:22:39 EST 2020


On Tue, Nov 17, 2020 at 9:59 PM Thomas Gleixner <tglx at linutronix.de> wrote:
>
> On Tue, Nov 17 2020 at 16:49, wi nk wrote:
> > On Sun, Nov 15, 2020 at 8:55 PM wi nk <wink at technolu.st> wrote:
> > So up until this point, everything is working without issues.
> > Everything seems to spiral out of control a couple of seconds later
> > when my system attempts to actually bring up the adapter.  In most of
> > the crash states I will see this:
> >
> > [   31.286725] wlp85s0: send auth to ec:08:6b:27:01:ea (try 1/3)
> > [   31.390187] wlp85s0: send auth to ec:08:6b:27:01:ea (try 2/3)
> > [   31.391928] wlp85s0: authenticated
> > [   31.394196] wlp85s0: associate with ec:08:6b:27:01:ea (try 1/3)
> > [   31.396513] wlp85s0: RX AssocResp from ec:08:6b:27:01:ea
> > (capab=0x411 status=0 aid=6)
> > [   31.407730] wlp85s0: associated
> > [   31.434354] IPv6: ADDRCONF(NETDEV_CHANGE): wlp85s0: link becomes ready
> >
> > And then either somewhere in that pile of messages, or a second or two
> > after this my machine will start to stutter as I mentioned before, and
> > then it either hangs, or I see this message (I'm truncating the
> > timestamp):
> >
> > [   35.xxxx ] sched: RT throttling activated
>
> As this driver uses threaded interrupts, this looks like an interrupt
> storm and the interrupt thread consumes the CPU fully. The RT throttler
> limits its RT runtime, which allows other tasks to make some
> progress. That's what you observe as stutter.
>
> You can apply the hack below so the irq thread(s) run in the SCHED_OTHER
> class which prevents them from monopolizing the CPU. That might make the
> problem simpler to debug.
>
> Thanks,
>
>         tglx
> ---
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index c460e0496006..8473ecacac7a 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1320,7 +1320,7 @@ setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
>         if (IS_ERR(t))
>                 return PTR_ERR(t);
>
> -       sched_set_fifo(t);
> +       //sched_set_fifo(t);
>
>         /*
>          * We keep the reference to the task struct even if

I was able to apply this patch and experiment a little.  Unfortunately,
the behaviour is mostly the same.  The patch seems to prolong the
'stuttering' phase a little, but the end result is still an
unresponsive machine.  I haven't had much time to test yet, but the
longer window before the hang may make it possible to finally issue
sysrq-c and get a vmcore dump.  I also tried to replicate a Google
Android patch I found that basically calls BUG() when RT throttling
activates
(https://groups.google.com/a/chromium.org/g/chromium-os-reviews/c/NDyPucYrvRY)
but that path hasn't triggered for me since I booted with it.  I'll
hopefully have another chance this evening.
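
One way to confirm that the hack above actually took effect is to look
at the scheduling policy of the interrupt threads (the kernel names
them "irq/<nr>-<name>").  The following is only a minimal userspace
sketch, not part of the patch: policy_name() and print_policy() are
hypothetical helper names, and the PID has to be looked up separately.
It relies only on the standard sched_getscheduler() call.

```c
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* Map a policy value from sched_getscheduler() to a readable name.
 * Only the three POSIX policies are covered; anything else (for
 * example Linux-specific policies) is reported as "unknown". */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

/* Print the scheduling policy of one task (hypothetical helper).
 * pid 0 means the calling thread.  Returns 0 on success, -1 on error. */
int print_policy(pid_t pid)
{
    int policy = sched_getscheduler(pid);

    if (policy < 0) {
        perror("sched_getscheduler");
        return -1;
    }
    printf("pid %d: %s\n", (int)pid, policy_name(policy));
    return 0;
}
```

Calling print_policy() with the PID of an irq thread should report
SCHED_OTHER on a kernel built with the hack, assuming nothing else
changed the thread's policy afterwards.  The PIDs can be found with
something like "ps -eLo pid,cls,rtprio,comm | grep irq/", which also
shows the scheduling class directly.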


