[PATCH 1/2] PCI: Provide sensible irq vector alloc/free routines
Alexander Gordeev
agordeev at redhat.com
Wed May 11 02:44:32 PDT 2016
On Wed, May 11, 2016 at 10:50:49AM +0200, Christoph Hellwig wrote:
> Hi Alexander,
>
> > Moreover, patch 2/2 not only removed pci_enable_msi[x]_range()
> > internal fallback logic (from desired number of interrupts to
> > available number of interrupts), but it also removed MSI-X to
> > MSI fallback.
>
> That is done in the very begining of the function, see the quoted
> part of the patch just below:
Nope. This check verifies MSI info, but fail to call arch-specific
arch_setup_msi_irqs() function - that is where an arch actually
does MSI interrupt allocation. I.e. some PPC platforms are known
for its run-time quota which fails to allocate as many vectors as
device MSI header reports. Such conditions are circumvented with a
cycle in range family functions:
do {
/* pci_enable_msix() calls arch_setup_msi_irqs() */
rc = pci_enable_msix(dev, entries, nvec);
if (rc < 0) {
return rc;
} else if (rc > 0) {
if (rc < minvec)
return -ENOSPC;
nvec = rc;
}
} while (rc);
> > > + if (dev->msix_cap && !(flags & PCI_IRQ_NOMSIX))
> > > + nr_vecs = min_t(unsigned int, nr_vecs, pci_msix_vec_count(dev));
> > > + else if (dev->msi_cap)
> > > + nr_vecs = min_t(unsigned int, nr_vecs, pci_msi_vec_count(dev));
> > > + else
> > > + goto use_legacy_irq;
So above and below you choose either MSI-X or MSI. That is not a fallback
from MSI-X to MSI, nor most possible number of whatever type of MSI vectors
obtained from the underlying architecture. By contrast, the current NVMe
driver implementation does it:
vecs = pci_enable_msix_range(pdev, dev->entry, 1, nr_io_queues);
if (vecs < 0) {
vecs = pci_enable_msi_range(pdev, 1, min(nr_io_queues, 32));
if (vecs < 0) {
vecs = 1;
} else {
for (i = 0; i < vecs; i++)
dev->entry[i].vector = i + pdev->irq;
}
}
Again, I do not know what change to NVMe driver is needed. I am just
pointing out that the changlog for pci_alloc_irq_vectors() function is
not entirely correct and the change to NVMe driver might be undesirable.
> > > + if (dev->msix_cap && !(flags & PCI_IRQ_NOMSIX))
> > > + ret = __pci_enable_msix(dev, nr_vecs);
> > > + else
> > > + ret = __pci_enable_msi(dev, nr_vecs);
> >
> > 1. No fallbacks.
>
> I read through the code in msi.c in detail and could not find a legitimate
> case where we have msix_cap but actually fail the MSI-X allocation. If that
> is a valid case I can happily add the fallback here as well.
>
More information about the Linux-nvme
mailing list