NVME, isolcpus, and irq affinity

Ming Lei tom.leiming at gmail.com
Tue Oct 13 03:50:29 EDT 2020


On Tue, Oct 13, 2020 at 2:24 PM Chris Friesen
<chris.friesen at windriver.com> wrote:
>
> On 10/12/2020 6:51 PM, Ming Lei wrote:
> > On Mon, Oct 12, 2020 at 11:52 PM Chris Friesen
> > <chris.friesen at windriver.com> wrote:
> >>
> >> Hi,
> >>
> >> I'm not subscribed to the list so please CC me on replies.
> >>
> >> I've got a linux system running the RT kernel with threaded irqs.  On
> >> startup we affine the various irq threads to the housekeeping CPUs, but
> >> I recently hit a scenario where after some days of uptime we ended up
> >> with a number of NVME irq threads affined to application cores instead
> >> (not good when we're trying to run low-latency applications).
> >>
> >> Looking at the code, it appears that the NVME driver can in some
> >> scenarios call nvme_setup_io_queues() after the initial setup and thus
> >> allocate new IRQ threads at runtime.  It appears that this will then
> >> call pci_alloc_irq_vectors_affinity(), which seems to determine affinity
> >> without any regard for things like "isolcpus" or "cset shield".
> >>
> >> There seem to be other reports of similar issues:
> >>
> >> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1831566
> >>
> >> Am I worried about nothing, or is there a risk that those irq threads
> >> would actually need to do real work (which would cause unacceptable
> >> jitter in my application)?
> >>
> >> Assuming I'm reading the code correctly, how does it make sense for the
> >> NVME driver to affine interrupts to CPUs which have explicitly been
> >> designated as "isolated"?
> >
> > You may pass 'isolcpus=managed_irq,...' for this kind of isolation; see the
> > 'isolcpus=' entry in Documentation/admin-guide/kernel-parameters.txt for
> > details.
> >
> > This feature has been available since v5.6.
>
> I suspect that might work; unfortunately it's not available in our

Have you looked at commit 11ea68f553e2 ("genirq, sched/isolation:
Isolate from handling managed interrupts")? It is supposed to address
exactly this kind of issue.

> kernel and jumping to a brand new kernel will mean a lot of additional
> validation work so it's not something we can do on a whim.

You can backport that patch to your kernel.
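Note the patch only takes effect when the isolation is also requested
on the kernel command line, for example (the CPU list is just
illustrative; adjust it to your housekeeping/application split, and
keep the 'domain' flag if you also want the existing scheduler
isolation):

	isolcpus=managed_irq,domain,2-15

With that, managed IRQ affinity masks are trimmed to the housekeeping
CPUs whenever possible, and an isolated CPU only services a queue's
interrupt when no housekeeping CPU is left in that queue's mask.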

>
> I'm definitely looking forward to moving to something newer though.

Which newer kernel are you planning to move to?


Thanks,
Ming Lei


