[PATCH V3] nvme-pci: allow unmanaged interrupts
Daniel Wagner
dwagner at suse.de
Tue Jul 2 09:28:19 PDT 2024
On Tue, Jul 02, 2024 at 08:12:11PM GMT, Ming Lei wrote:
> On Tue, Jul 02, 2024 at 01:50:02PM +0200, Christoph Hellwig wrote:
> > On Tue, Jul 02, 2024 at 06:41:12PM +0800, Ming Lei wrote:
> > > From: Keith Busch <kbusch at kernel.org>
> > >
> > > People _really_ want to control their interrupt affinity in some
> > > cases, such as Openshift with Performance profile, in which each
> > > irq's affinity is completely specified from userspace. Turns out
> > > that 'isolcpus=managed_irq' isn't enough.
> > >
> > > Add module parameter to allow unmanaged interrupts, just as some
> > > SCSI drivers are doing.
> >
> > Same as before: hell no. We can't just add hacky global kernel
> > parameters everywhere. We need the cpu isolation infrastructure to
> > work properly instead of piling hacks of hacks in every relevant driver.
>
> Per my understanding, the cpu isolation infrastructure can't work for
> Openshift here: the IO workload can be run by applications executing
> on isolated CPUs, while userspace still expects interrupts to be
> triggered only on user-specified CPU cores, in a controllable way.
>
> Marcelo and Lawrence may have more input in this area.
>
> Also, irq allocation really is device & driver business, so how can that
> be a hack? We may not even be able to abstract a public API in the block
> layer for handling irq-related things.
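[For readers following along: the pattern Keith's patch (quoted above) borrows from some SCSI drivers is a bool module parameter that drops PCI_IRQ_AFFINITY from the vector allocation. The sketch below is illustrative only, not the actual diff; the parameter name and wiring are assumptions, and it is a kernel-code fragment, not standalone-buildable code.]

```c
/* Sketch only -- not the actual patch. Parameter name and wiring are
 * assumptions for illustration. */
static bool managed_irqs = true;
module_param(managed_irqs, bool, 0444);
MODULE_PARM_DESC(managed_irqs,
	"Use managed (kernel-affinitized) IRQs; set to 0 to let userspace "
	"control queue interrupt affinity via /proc/irq/*/smp_affinity");

static int nvme_setup_irqs(struct pci_dev *pdev, unsigned int nr_vecs)
{
	struct irq_affinity affd = { .pre_vectors = 1 };
	unsigned int flags = PCI_IRQ_ALL_TYPES;

	/* With PCI_IRQ_AFFINITY the kernel spreads and pins the vectors
	 * itself and rejects userspace writes to smp_affinity; dropping
	 * the flag leaves the affinity writable from userspace. */
	if (managed_irqs)
		flags |= PCI_IRQ_AFFINITY;

	return pci_alloc_irq_vectors_affinity(pdev, 1, nr_vecs, flags,
					      managed_irqs ? &affd : NULL);
}
```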
I am confused. I thought you told me that my series 'nvme-pci: honor
isolcpus configuration' is not necessary. But you still need this patch
to get the affinity sorted out? Wouldn't it make sense to figure out how
we can make my series work for your use case as well? E.g. we could
introduce another HK type (io_queue) to control the affinity. This would
decouple it from the managed_irq option.
More information about the Linux-nvme mailing list