[PATCHv2] nvme-pci: allow unmanaged interrupts

Keith Busch kbusch at kernel.org
Sun May 12 15:05:19 PDT 2024


On Sun, May 12, 2024 at 05:16:13PM +0300, Sagi Grimberg wrote:
> 
> > > Everyone expects nvme performance will suffer. IO latency and CPU
> > > efficiency are not everyone's top priority, so allowing people to
> > > optimize for something else seems like a reasonable request.
> > I guess more people may be interested in 'something else'; care to share
> > those use cases in the commit log, since nvme is going to support it?
> 
> I don't have a special interest in this, but I can share what I have
> heard several times. The use-case is that people want to dedicate a few
> cores to handling interrupts, so they know it does not take cpu time
> from their running application threads (usually pinned to different
> cores).
> 
> Isolating the app threads is more important to them than affinity to the
> device...

Yes, that is consistently the same reasoning I've heard. While managed
irq is overwhelmingly the best choice for most use cases, it's clearly
been communicated that some users do not want it for exactly this
reason.

As far as I can tell, there's no technical reason to prevent letting
people make that choice. This "kernel knows better than you" argument is
less sustainable than letting users do whatever they want with their
CPUs.
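
To make the shape of that choice concrete, here's a rough sketch of what
an opt-out could look like in nvme_setup_irqs(). The parameter name and
plumbing below are illustrative assumptions rather than the patch itself;
the substantive difference is only whether the driver passes
PCI_IRQ_AFFINITY (and an irq_affinity descriptor) when allocating its
vectors:

	/*
	 * Sketch only: "managed_irqs" is an assumed name, not necessarily
	 * what the real patch calls it. The default keeps today's behavior.
	 */
	static bool managed_irqs = true;
	module_param(managed_irqs, bool, 0444);
	MODULE_PARM_DESC(managed_irqs,
			 "set to false to allow user controlled irq affinity");

	static int nvme_setup_irqs(struct nvme_dev *dev, unsigned int nr_io_queues)
	{
		struct pci_dev *pdev = to_pci_dev(dev->dev);
		struct irq_affinity affd = {
			.pre_vectors	= 1,	/* admin queue */
			.calc_sets	= nvme_calc_irq_sets,
			.priv		= dev,
		};
		unsigned int flags = PCI_IRQ_ALL_TYPES;

		/*
		 * Managed (PCI_IRQ_AFFINITY): the core spreads vectors across
		 * CPUs and the masks are fixed for the device's lifetime.
		 * Unmanaged: vectors get a default mask that root may rewrite
		 * at any time via /proc/irq/<n>/smp_affinity.
		 */
		if (managed_irqs)
			flags |= PCI_IRQ_AFFINITY;

		/* queue set sizing elided for brevity */
		return pci_alloc_irq_vectors_affinity(pdev, 1, nr_io_queues + 1,
				flags, managed_irqs ? &affd : NULL);
	}

Note the affinity descriptor is only passed in the managed case; the PCI
core warns if it gets an affd without PCI_IRQ_AFFINITY set.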

> > > > Is there any benefit to using unmanaged irqs in this way?
> > > The immediate desire is more predictable scheduling on a subset of CPUs
> > > by steering hardware interrupts somewhere else. It's the same reason
> > > RDMA undid managed interrupts.
> > > 
> > >    231243c82793428 ("Revert "mlx5: move affinity hints assignments to generic code"")
> > The above commit only mentions that it became inflexible since users
> > can't adjust irq affinity any more.
> > 
> > That is understandable for networking; there is a long history of
> > people needing to adjust irq affinity from user space.
> 
> I suspect that the reasoning is similar for nvme as well.

+1, exactly.
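
For reference, the user-space adjustment mentioned above is just a write
to procfs once the vectors are unmanaged. A tiny illustration (not part
of the patch) that repins one vector, and which fails today because
managed irqs reject the write:

	#include <stdio.h>

	/*
	 * Illustration only: repin an interrupt by writing a hex cpumask to
	 * /proc/irq/<irq>/smp_affinity, e.g. "./repin 45 0c" to steer irq 45
	 * to cpus 2-3. Needs root.
	 */
	int main(int argc, char **argv)
	{
		char path[64];
		FILE *f;

		if (argc != 3) {
			fprintf(stderr, "usage: %s <irq> <hex cpumask>\n", argv[0]);
			return 1;
		}
		snprintf(path, sizeof(path), "/proc/irq/%s/smp_affinity", argv[1]);
		f = fopen(path, "w");
		if (!f) {
			perror(path);
			return 1;
		}
		/*
		 * A managed vector rejects this write (typically EIO); an
		 * unmanaged one moves to the new mask immediately.
		 */
		if (fprintf(f, "%s\n", argv[2]) < 0 || fclose(f) == EOF) {
			perror(path);
			return 1;
		}
		return 0;
	}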


