[PATCH 2/2] nvme-pci: allow unmanaged interrupts

Benjamin Meier benjamin.meier70 at gmail.com
Mon May 13 01:59:02 PDT 2024


 > > The application which we develop and maintain (in the company I work)
 > > has very high requirements regarding latency. We have some isolated
 > > cores
 >
 > Are these isolated cores controlled by kernel command line `isolcpus=`?

Yes, exactly.

 > > and we run our application on those.
 > >
 > > Our system is using kernel 5.4 which unfortunately does not support
 > > "isolcpus=managed_irq". Actually, we did not even know about that
 > > option, because we are focussed on kernel 5.4. It solves part
 > > of our problem, but being able to specify where exactly interrupts
 > > are running is still superior in our opinion.
 > >
 > > E.g. assume the number of house-keeping cores is small, because we
 > > want to have full control over the system. In our case we have threads
 > > of different priorities where some get an exclusive core. Some other
 > > threads share a core (or a group of cores) with other threads. Now we
 > > are still happy to assign some interrupts to some of the cores which we
 > > consider as "medium-priority". Due to the small number of non-isolated
 > > cores, it can
 >
 > So these "medium-priority" cores belong to isolated cpu list, you still
 > expect NVMe interrupts can be handled on these cpu cores, do I understand
 > correctly?

We want to keep the NVMe interrupts off the "high-priority" cores. Noise on
them is quite bad for us, so we wanted to move some interrupts to the
house-keeping cores and, if needed (due to performance issues), keep some on
the "medium-priority" isolated cores. NVMe is not the highest priority for us,
but running too much on the house-keeping cores could also be bad.
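
For illustration only (not part of the patch): with unmanaged interrupts,
this kind of placement is usually done from user space by writing a CPU list
to /proc/irq/<N>/smp_affinity_list. Below is a minimal Python sketch, assuming
a hypothetical 8-CPU machine where CPUs 4-7 are the "high-priority" cores. The
helper names, the CPU layout, and the IRQ number in the usage note are all
made up for the example.

```python
def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpu_list(cpus):
    """Format a set of CPU numbers compactly, e.g. {0, 1, 2, 8} -> "0-2,8"."""
    ordered = sorted(cpus)
    ranges = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current range while the next CPU is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


def allowed_irq_cpus(all_cpus, high_priority_cpus):
    """CPUs an interrupt may run on: everything except the high-priority cores."""
    return all_cpus - high_priority_cpus


def set_irq_affinity(irq, cpus):
    """Write the CPU list for one IRQ (needs root; fails for managed IRQs)."""
    with open(f"/proc/irq/{irq}/smp_affinity_list", "w") as f:
        f.write(format_cpu_list(cpus))
```

With the assumed layout, allowed_irq_cpus(parse_cpu_list("0-7"),
parse_cpu_list("4-7")) yields CPUs 0-3, which could then be passed to
set_irq_affinity() together with a (hypothetical) IRQ number such as 42. The
write is rejected by the kernel when the interrupt is a managed one.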

 > If yes, I think your case still can be covered with 'isolcpus=managed_irq'
 > which needn't to be same with cpu cores specified from `isolcpus=`, such as
 > excluding medium-priority cores from 'isolcpus=managed_irq', and
 > meantime include them in plain `isolcpus=`.

Unfortunately, our kernel version (5.4) does not support "managed_irq", which
is why we're happy with the patch. However, I see that on newer kernel versions
the existing options could be sufficient to do everything.

 > > be tricky to assign all interrupts to those without a
 > > performance-penalty.
 > >
 > > Given these requirements, manually specifying interrupt/core
 > > assignments
 > > would offer greater flexibility and control over system performance.
 > > Moreover, the proposed code changes appear minimal and have no
 > > impact on existing functionalities.
 >
 > Looks your main concern is performance, but as Keith mentioned, the
 > proposed change may degrade nvme perf too:
 >
 > https://lore.kernel.org/linux-nvme/Zj6745UDnwX1BteO@kbusch-mbp.dhcp.thefacebook.com/

Yes, but for NVMe it's not that critical. The most important point for us is
to keep the interrupts away from our "high-priority" cores. We still wanted
control over where those interrupts run, partly because we simply did not know
about the "managed_irq" option.

Thanks,
Benjamin


