[PATCHv2] nvme-pci: allow unmanaged interrupts
Keith Busch
kbusch at kernel.org
Fri May 10 17:29:23 PDT 2024
On Sat, May 11, 2024 at 07:47:26AM +0800, Ming Lei wrote:
> On Fri, May 10, 2024 at 10:46:45AM -0700, Keith Busch wrote:
> >  		map->queue_offset = qoff;
> > -		if (i != HCTX_TYPE_POLL && offset)
> > +		if (managed_irqs && i != HCTX_TYPE_POLL && offset)
> >  			blk_mq_pci_map_queues(map, to_pci_dev(dev->dev), offset);
> >  		else
> >  			blk_mq_map_queues(map);
>
> Now the queue mapping is built with nothing from irq affinity, which is
> set up from userspace, and performance could be pretty bad.
This just decouples the software queue mapping from the irq mapping. Every
CPU still has a blk-mq hctx; there's just no connection to the completing
CPU if you enable this.
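To make that concrete, here's a minimal sketch of the kind of naive spread
the blk_mq_map_queues() fallback amounts to. This is not the actual block
layer code (the real implementation also accounts for CPU topology); it
just shows that every possible CPU gets an hctx, round-robin, with no
reference to where the device's interrupts fire:

	#include <linux/blk-mq.h>
	#include <linux/cpumask.h>

	/*
	 * Spread every possible CPU across the map's hctxs round-robin,
	 * ignoring irq affinity entirely; this is the "decoupled" mapping.
	 */
	static void naive_spread(struct blk_mq_queue_map *qmap)
	{
		unsigned int cpu;

		for_each_possible_cpu(cpu)
			qmap->mq_map[cpu] = qmap->queue_offset +
					    cpu % qmap->nr_queues;
	}

Completions can then land on a CPU other than the submitting one, which is
where Ming's performance concern comes from.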
Everyone expects nvme performance to suffer. IO latency and CPU
efficiency are not everyone's top priority, so allowing people to
optimize for something else seems like a reasonable request.
> Is there any benefit to use unmanaged irq in this way?
The immediate desire is more predictable scheduling on a subset of CPUs
by steering hardware interrupts somewhere else. It's the same reason
RDMA undid managed interrupts.
231243c82793428 ("Revert "mlx5: move affinity hints assignments to generic code"")
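With unmanaged interrupts, that steering happens from userspace, which the
kernel refuses for a managed interrupt. As a minimal sketch, assuming
(hypothetically) that the queue's vector shows up as IRQ 45, pinning it to
CPUs 0-3 is just a procfs write, i.e. the programmatic equivalent of
"echo f > /proc/irq/45/smp_affinity":

	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		/*
		 * Hypothetical irq number; check /proc/interrupts for
		 * the real nvme queue vectors on a given system.
		 */
		int fd = open("/proc/irq/45/smp_affinity", O_WRONLY);

		if (fd < 0)
			return 1;
		/* mask 0xf = CPUs 0-3 */
		if (write(fd, "f\n", 2) < 0)
			return 1;
		close(fd);
		return 0;
	}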
Yes, the kernel's managed interrupts are the best choice for optimizing
interaction with that device, but they're not free, and maybe you want to
exchange that optimization for something else.