NVME, isolcpus, and irq affinity

Ming Lei tom.leiming at gmail.com
Mon Oct 12 20:51:43 EDT 2020


On Mon, Oct 12, 2020 at 11:52 PM Chris Friesen
<chris.friesen at windriver.com> wrote:
>
> Hi,
>
> I'm not subscribed to the list so please CC me on replies.
>
> I've got a linux system running the RT kernel with threaded irqs.  On
> startup we affine the various irq threads to the housekeeping CPUs, but
> I recently hit a scenario where after some days of uptime we ended up
> with a number of NVME irq threads affined to application cores instead
> (not good when we're trying to run low-latency applications).
>
> Looking at the code, it appears that the NVME driver can in some
> scenarios call nvme_setup_io_queues() after the initial setup and thus
> allocate new IRQ threads at runtime.  It appears that this will then
> call pci_alloc_irq_vectors_affinity(), which seems to determine affinity
> without any regard for things like "isolcpus" or "cset shield".
>
> There seem to be other reports of similar issues:
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1831566
>
> Am I worried about nothing, or is there a risk that those irq threads
> would actually need to do real work (which would cause unacceptable
> jitter in my application)?
>
> Assuming I'm reading the code correctly, how does it make sense for the
> NVME driver to affine interrupts to CPUs which have explicitly been
> designated as "isolated"?

You can pass 'isolcpus=managed_irq,...' for this kind of isolation; see the
'isolcpus=' entry in Documentation/admin-guide/kernel-parameters.txt for details.
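
For example, if CPUs 2-15 were the isolated application cores (that CPU list
is only an assumption here; substitute your own isolated set), the boot
command line could include something like:

    isolcpus=managed_irq,domain,2-15

With 'managed_irq' in the flag list, the kernel's managed-interrupt spreading
avoids placing the effective affinity of managed IRQs (such as the NVMe queue
interrupts) on the isolated CPUs, as long as there are housekeeping CPUs left
in the calculated mask to service them.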

This feature has been available since v5.6.
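
As a quick sanity check after boot, you can look at where the NVMe queue
interrupts actually land (the 'nvme' match below is just an illustration;
match whatever names show up in /proc/interrupts on your system):

    for irq in $(awk -F: '/nvme/ {gsub(/ /,"",$1); print $1}' /proc/interrupts); do
        echo "IRQ $irq -> CPUs $(cat /proc/irq/$irq/effective_affinity_list)"
    done

The affinity of managed interrupts cannot be changed from user space via
/proc/irq/*, so with 'managed_irq' in effect the values printed above should
stay within the housekeeping CPUs.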

Thanks,
Ming Lei



More information about the Linux-nvme mailing list