[PATCH 2/2] nvme-pci: allow unmanaged interrupts

Ming Lei ming.lei at redhat.com
Mon May 13 02:25:09 PDT 2024


On Mon, May 13, 2024 at 10:59:02AM +0200, Benjamin Meier wrote:
> > > The application which we develop and maintain (in the company I work)
> > > has very high requirements regarding latency. We have some isolated
> > > cores
> >
> > Are these isolated cores controlled by kernel command line `isolcpus=`?
> 
> Yes, exactly.
> 
> > > and we run our application on those.
> > >
> > > Our system is using kernel 5.4 which unfortunately does not support
> > > "isolcpus=managed_irq". Actually, we did not even know about that
> > > option, because we are focussed on kernel 5.4. It solves part
> > > of our problem, but being able to specify where exactly interrupts
> > > are running is still superior in our opinion.
> > >
> > > E.g. assume the number of house-keeping cores is small, because we
> > > want to have full control over the system. In our case we have threads
> > > of different priorities where some get an exclusive core. Some other
> > > threads share a core (or a group of cores) with other threads. Now we
> > > are still happy to assign some interrupts to some of the cores which
> > > we consider as "medium-priority". Due to the small number of
> > > non-isolated cores, it can
> >
> > So these "medium-priority" cores belong to the isolated cpu list, and you
> > still expect NVMe interrupts to be handled on these cpu cores, do I
> > understand correctly?
> 
> We want to avoid that the NVMe interrupts are on the "high-priority" cores.
> Having noise on them is quite bad for us, so we wanted to move some
> interrupts to house-keeping cores and if needed (due to performance issues)
> keep some on those "medium-priority" isolated cores. NVMe is not the
> highest priority for us, but possibly running too much on the
> house-keeping cores could also be bad.
> 
> > If yes, I think your case can still be covered with 'isolcpus=managed_irq',
> > whose cpu list needn't be the same as the one given to plain `isolcpus=`:
> > exclude the medium-priority cores from 'isolcpus=managed_irq' while
> > still including them in plain `isolcpus=`.
> 
> Unfortunately, our kernel version (5.4) does not support "managed_irq" and
> due to that we're happy with the patch. However, I see that for newer
> kernel versions the already existing arguments could be sufficient to do
> everything.

'isolcpus=managed_irq' enablement patches are small, and shouldn't be very
hard to backport.

> 
> > > be tricky to assign all interrupts to those without a
> > > performance-penalty.
> > >
> > > Given these requirements, manually specifying interrupt/core assignments
> > > would offer greater flexibility and control over system performance.
> > > Moreover, the proposed code changes appear minimal and have no
> > > impact on existing functionalities.
> >
> > Looks like your main concern is performance, but as Keith mentioned, the
> > proposed change may degrade nvme perf too:
> >
> > https://lore.kernel.org/linux-nvme/Zj6745UDnwX1BteO@kbusch-mbp.dhcp.thefacebook.com/
> 
> Yes, but for NVMe it's not that critical. The most important point for us is
> to keep them away from our "high-priority" cores. We still wanted to have
> control where we run those interrupts, but also because we just did not
> know the "managed_irq" option.

OK, thanks for sharing the input!

Now from the upstream viewpoint, 'isolcpus=managed_irq' should work for your
case, so supporting nvme unmanaged irq doesn't seem necessary, at least for
this requirement.
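
For illustration, here is a rough sketch of the split described above. The
8-core layout (0-1 house-keeping, 2-3 medium-priority, 4-7 high-priority) and
the helper names are made up for this example. It only builds the command
line string: plain isolation (the 'domain' flag, which is the isolcpus=
default) covers medium + high priority cores, while 'managed_irq' lists only
the high-priority ones. Whether a given kernel accepts two isolcpus= entries
with different cpu lists should be checked against its
Documentation/admin-guide/kernel-parameters.txt, and note that managed_irq
isolation is documented as best effort.

```python
# Hypothetical core layout, for illustration only.
HOUSEKEEPING = {0, 1}
MEDIUM_PRIORITY = {2, 3}
HIGH_PRIORITY = {4, 5, 6, 7}


def to_cpu_list(cpus):
    """Format a set of CPU numbers as a kernel cpu-list string, e.g. '0,2-3,9'."""
    ranges = []
    start = prev = None
    for cpu in sorted(cpus):
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu  # extend the current contiguous range
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def isolcpus_args(medium, high):
    """Build two isolcpus= parameters: scheduler isolation for medium + high
    priority cores, managed-irq isolation for the high-priority cores only."""
    domain = f"isolcpus=domain,{to_cpu_list(medium | high)}"
    managed = f"isolcpus=managed_irq,{to_cpu_list(high)}"
    return f"{domain} {managed}"


if __name__ == "__main__":
    print(isolcpus_args(MEDIUM_PRIORITY, HIGH_PRIORITY))
```

With this layout, managed NVMe interrupts would be steered away from cores
4-7 where possible, but may still land on the medium-priority cores 2-3 and
the house-keeping cores 0-1.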


thanks,
Ming



