[PATCH 14/16] PCI: hv: Switch to msi_create_parent_irq_domain()
Nam Cao
namcao at linutronix.de
Sat Jul 5 02:46:55 PDT 2025
On Sat, Jul 05, 2025 at 03:51:48AM +0000, Michael Kelley wrote:
> From: Nam Cao <namcao at linutronix.de> Sent: Thursday, June 26, 2025 7:48 AM
> >
> > Move away from the legacy MSI domain setup, switch to use
> > msi_create_parent_irq_domain().
>
> With the additional tweak to this patch that you supplied separately,
> everything in my testing on both x86 and arm64 seems to work OK. So
> that's all good.
>
> On arm64, I did notice the following IRQ domain information from
> /sys/kernel/debug/irq/domains:
>
> # cat HV-PCI-MSIX-1e03\:00\:00.0-12
> name: HV-PCI-MSIX-1e03:00:00.0-12
> size: 0
> mapped: 7
> flags: 0x00000213
> IRQ_DOMAIN_FLAG_HIERARCHY
> IRQ_DOMAIN_NAME_ALLOCATED
> IRQ_DOMAIN_FLAG_MSI
> IRQ_DOMAIN_FLAG_MSI_DEVICE
> parent: 5D202AA8-1E03-4F0F-A786-390A0D2749E9-3
> name: 5D202AA8-1E03-4F0F-A786-390A0D2749E9-3
> size: 0
> mapped: 7
> flags: 0x00000103
> IRQ_DOMAIN_FLAG_HIERARCHY
> IRQ_DOMAIN_NAME_ALLOCATED
> IRQ_DOMAIN_FLAG_MSI_PARENT
> parent: hv_vpci_arm64
> name: hv_vpci_arm64
> size: 956
> mapped: 31
> flags: 0x00000003
> IRQ_DOMAIN_FLAG_HIERARCHY
> IRQ_DOMAIN_NAME_ALLOCATED
> parent: irqchip at 0x00000000ffff0000-1
> name: irqchip at 0x00000000ffff0000-1
> size: 0
> mapped: 47
> flags: 0x00000003
> IRQ_DOMAIN_FLAG_HIERARCHY
> IRQ_DOMAIN_NAME_ALLOCATED
>
> The 5D202AA8-1E03-4F0F-A786-390A0D2749E9-3 domain has
> IRQ_DOMAIN_FLAG_MSI_PARENT set. But the hv_vpci_arm64
> and irqchip at ... domains do not. Is that a problem? On x86,
> the output is this, with IRQ_DOMAIN_FLAG_MSI_PARENT set
> in the next level up VECTOR domain:
That looks normal. IRQ_DOMAIN_FLAG_MSI_PARENT is set for domains which
provide MSI parent domain capability, which happens to be the case for x86
vector.
> # cat HV-PCI-MSIX-6b71\:00\:02.0-12
> name: HV-PCI-MSIX-6b71:00:02.0-12
> size: 0
> mapped: 17
> flags: 0x00000213
> IRQ_DOMAIN_FLAG_HIERARCHY
> IRQ_DOMAIN_NAME_ALLOCATED
> IRQ_DOMAIN_FLAG_MSI
> IRQ_DOMAIN_FLAG_MSI_DEVICE
> parent: 8564CB14-6B71-477C-B189-F175118E6FF0-3
> name: 8564CB14-6B71-477C-B189-F175118E6FF0-3
> size: 0
> mapped: 17
> flags: 0x00000103
> IRQ_DOMAIN_FLAG_HIERARCHY
> IRQ_DOMAIN_NAME_ALLOCATED
> IRQ_DOMAIN_FLAG_MSI_PARENT
> parent: VECTOR
> name: VECTOR
> size: 0
> mapped: 67
> flags: 0x00000103
> IRQ_DOMAIN_FLAG_HIERARCHY
> IRQ_DOMAIN_NAME_ALLOCATED
> IRQ_DOMAIN_FLAG_MSI_PARENT
>
> Finally, I've noted a couple of code review comments below. These
> comments may reflect my lack of fully understanding the MSI
> IRQ handling, in which case, please set me straight. Thanks,
>
> Michael
>
> >
> > Signed-off-by: Nam Cao <namcao at linutronix.de>
> > ---
> > Cc: K. Y. Srinivasan <kys at microsoft.com>
> > Cc: Haiyang Zhang <haiyangz at microsoft.com>
> > Cc: Wei Liu <wei.liu at kernel.org>
> > Cc: Dexuan Cui <decui at microsoft.com>
> > Cc: linux-hyperv at vger.kernel.org
> > ---
> > drivers/pci/Kconfig | 1 +
> > drivers/pci/controller/pci-hyperv.c | 98 +++++++++++++++++++++++------
> > 2 files changed, 80 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
> > index 9c0e4aaf4e8cb..9a249c65aedcd 100644
> > --- a/drivers/pci/Kconfig
> > +++ b/drivers/pci/Kconfig
> > @@ -223,6 +223,7 @@ config PCI_HYPERV
> > tristate "Hyper-V PCI Frontend"
> > depends on ((X86 && X86_64) || ARM64) && HYPERV && PCI_MSI && SYSFS
> > select PCI_HYPERV_INTERFACE
> > + select IRQ_MSI_LIB
> > help
> > The PCI device frontend driver allows the kernel to import arbitrary
> > PCI devices from a PCI backend to support PCI driver domains.
> > diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> > index ef5d655a0052c..3a24fadddb83b 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -44,6 +44,7 @@
> > #include <linux/delay.h>
> > #include <linux/semaphore.h>
> > #include <linux/irq.h>
> > +#include <linux/irqchip/irq-msi-lib.h>
> > #include <linux/msi.h>
> > #include <linux/hyperv.h>
> > #include <linux/refcount.h>
> > @@ -508,7 +509,6 @@ struct hv_pcibus_device {
> > struct list_head children;
> > struct list_head dr_list;
> >
> > - struct msi_domain_info msi_info;
> > struct irq_domain *irq_domain;
> >
> > struct workqueue_struct *wq;
> > @@ -1687,7 +1687,7 @@ static void hv_msi_free(struct irq_domain *domain, struct msi_domain_info *info,
> > struct msi_desc *msi = irq_data_get_msi_desc(irq_data);
> >
> > pdev = msi_desc_to_pci_dev(msi);
> > - hbus = info->data;
> > + hbus = domain->host_data;
> > int_desc = irq_data_get_irq_chip_data(irq_data);
> > if (!int_desc)
> > return;
> > @@ -1705,7 +1705,6 @@ static void hv_msi_free(struct irq_domain *domain, struct msi_domain_info *info,
> >
> > static void hv_irq_mask(struct irq_data *data)
> > {
> > - pci_msi_mask_irq(data);
> > if (data->parent_data->chip->irq_mask)
> > irq_chip_mask_parent(data);
> > }
> > @@ -1716,7 +1715,6 @@ static void hv_irq_unmask(struct irq_data *data)
> >
> > if (data->parent_data->chip->irq_unmask)
> > irq_chip_unmask_parent(data);
> > - pci_msi_unmask_irq(data);
> > }
> >
> > struct compose_comp_ctxt {
> > @@ -2101,6 +2099,44 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
> > msg->data = 0;
> > }
> >
> > +static bool hv_pcie_init_dev_msi_info(struct device *dev, struct irq_domain *domain,
> > + struct irq_domain *real_parent, struct msi_domain_info *info)
> > +{
> > + struct irq_chip *chip = info->chip;
> > +
> > + if (!msi_lib_init_dev_msi_info(dev, domain, real_parent, info))
> > + return false;
> > +
> > + info->ops->msi_prepare = hv_msi_prepare;
> > +
> > + chip->irq_set_affinity = irq_chip_set_affinity_parent;
> > +
> > + if (IS_ENABLED(CONFIG_X86))
> > + chip->flags |= IRQCHIP_MOVE_DEFERRED;
> > +
> > + return true;
> > +}
> > +
> > +#define HV_PCIE_MSI_FLAGS_REQUIRED (MSI_FLAG_USE_DEF_DOM_OPS | \
> > + MSI_FLAG_USE_DEF_CHIP_OPS | \
> > + MSI_FLAG_PCI_MSI_MASK_PARENT)
> > +#define HV_PCIE_MSI_FLAGS_SUPPORTED (MSI_FLAG_MULTI_PCI_MSI | \
> > + MSI_FLAG_PCI_MSIX | \
> > + MSI_GENERIC_FLAGS_MASK)
> > +
> > +static const struct msi_parent_ops hv_pcie_msi_parent_ops = {
> > + .required_flags = HV_PCIE_MSI_FLAGS_REQUIRED,
> > + .supported_flags = HV_PCIE_MSI_FLAGS_SUPPORTED,
> > + .bus_select_token = DOMAIN_BUS_PCI_MSI,
> > +#ifdef CONFIG_X86
> > + .chip_flags = MSI_CHIP_FLAG_SET_ACK,
> > +#elif defined(CONFIG_ARM64)
> > + .chip_flags = MSI_CHIP_FLAG_SET_EOI,
> > +#endif
> > + .prefix = "HV-",
> > + .init_dev_msi_info = hv_pcie_init_dev_msi_info,
> > +};
> > +
> > /* HW Interrupt Chip Descriptor */
> > static struct irq_chip hv_msi_irq_chip = {
> > .name = "Hyper-V PCIe MSI",
> > @@ -2108,7 +2144,6 @@ static struct irq_chip hv_msi_irq_chip = {
> > .irq_set_affinity = irq_chip_set_affinity_parent,
> > #ifdef CONFIG_X86
> > .irq_ack = irq_chip_ack_parent,
> > - .flags = IRQCHIP_MOVE_DEFERRED,
> > #elif defined(CONFIG_ARM64)
> > .irq_eoi = irq_chip_eoi_parent,
> > #endif
>
> Would it work to drop the #ifdef's and always set both .irq_ack and
> .irq_eoi on x86 and on ARM64? Is which one gets called controlled by the
> child HV-PCI-MSIX- ... domain, based on the .chip_flags?
>
> I'm trying to reduce the #ifdef clutter. I
> tested without the #ifdefs on both x86 and arm64, and
> everything works, but I know that doesn't prove that it's
> OK.
Nothing is wrong with that, as far as I can tell.
> If the #ifdefs can go away, then I'd like to see a tweak to the way
> .chip_flags is set. Rather than do an #ifdef inline for struct
> msi_parent_ops hv_pcie_msi_parent_ops, add a #define
> HV_MSI_CHIP_FLAGS in the existing #ifdef X86 and #ifdef ARM64
> sections respectively near the top of this source file, and then
> use HV_MSI_CHIP_FLAGS in struct msi_parent_ops
> hv_pcie_msi_parent_ops. As much as is reasonable, I'd like to
> not clutter the code with #ifdef X86 #elseif ARM64, but instead
> group all the differences under the existing #ifdefs near the top.
> There are some places where this isn't practical, but this seems
> like a place that is practical.
Yes, that would be better. I will do it in v2.
> > @@ -2116,9 +2151,37 @@ static struct irq_chip hv_msi_irq_chip = {
> > .irq_unmask = hv_irq_unmask,
> > };
> >
> > -static struct msi_domain_ops hv_msi_ops = {
> > - .msi_prepare = hv_msi_prepare,
> > - .msi_free = hv_msi_free,
> > +static int hv_pcie_domain_alloc(struct irq_domain *d, unsigned int virq, unsigned int nr_irqs,
> > + void *arg)
> > +{
> > + /* TODO: move the content of hv_compose_msi_msg() in here */
>
> Could you elaborate on this TODO? Is the idea to loop through all the IRQs and
> generate the MSI message for each one? What is the advantage to doing it here?
> I noticed in Patch 3 of the series, the Aardvark controller has
> advk_msi_irq_compose_msi_msg(), but you had not moved it into the domain
> allocation path.
Sorry for being unclear. hv_compose_msi_msg() should not be moved here
entirely. Let me elaborate this in v2.
What I meant is that, hv_compose_msi_msg() is doing more than what this
callback is supposed to do (composing message). It works, but it is not
correct. Interrupt allocation is the responsibility of
irq_domain_ops::alloc(). Allocating and populating int_desc should be in
hv_pcie_domain_alloc() instead.
irq_domain_ops's .alloc() and .free() should be asymmetric.
>
> Also, is there some point in the time in the future where the "TODO" is likely to
> become a "MUST DO"?
There's nothing planned that would make this non-functional, as far as I
know.
Thanks so much for examining the patch,
Nam
More information about the Linux-mediatek
mailing list