[PATCH v2 3/5] genirq/msi: Move prepare() call to per-device allocation

Zenghui Yu yuzenghui at huawei.com
Tue Jun 3 01:22:47 PDT 2025


Hi Marc,

On 2025/5/14 0:31, Marc Zyngier wrote:
> The current device MSI infrastructure is subtly broken, as it
> will issue an .msi_prepare() callback into the MSI controller
> driver every time it needs to allocate an MSI. That's pretty wrong,
> as the contract (or unwarranted assumption, depending who you ask)
> between the MSI controller and the core code is that .msi_prepare()
> is called exactly once per device.
> 
> This leads to some subtle breakage in said MSI controller drivers,
> as it gives the impression that there are multiple endpoints sharing
> a bus identifier (RID in PCI parlance, DID for GICv3+). It implies
> that whatever allocation the ITS driver (for example) has done on
> behalf of these devices cannot be undone, as there is no way to
> track the shared state. This is particularly bad for wire-MSI devices,
> for which .msi_prepare() is called for. each. input. line.
> 
> To address this issue, move the call to .msi_prepare() to take place
> at the point of irq domain allocation, which is the only place that
> makes sense. The msi_alloc_info_t structure is made part of the
> msi_domain_template, so that its life-cycle is that of the domain
> as well.
> 
> Finally, the msi_info::alloc_data field is made to point at this
> allocation tracking structure, ensuring that it is carried around
> the block.
> 
> This is all pretty straightforward, except for the non-device-MSI
> leftovers, which still have to call .msi_prepare() at the old
> spot. One day...
> 
> Signed-off-by: Marc Zyngier <maz at kernel.org>
> ---
>  include/linux/msi.h |  2 ++
>  kernel/irq/msi.c    | 35 +++++++++++++++++++++++++++++++----
>  2 files changed, 33 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/msi.h b/include/linux/msi.h
> index 63c23003ec9b7..ba1c77a829a1c 100644
> --- a/include/linux/msi.h
> +++ b/include/linux/msi.h
> @@ -516,12 +516,14 @@ struct msi_domain_info {
>   * @chip:	Interrupt chip for this domain
>   * @ops:	MSI domain ops
>   * @info:	MSI domain info data
> + * @alloc_info:	MSI domain allocation data (arch specific)
>   */
>  struct msi_domain_template {
>  	char			name[48];
>  	struct irq_chip		chip;
>  	struct msi_domain_ops	ops;
>  	struct msi_domain_info	info;
> +	msi_alloc_info_t	alloc_info;
>  };
>  
>  /*
> diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
> index 31378a2535fb9..07eb857efd15e 100644
> --- a/kernel/irq/msi.c
> +++ b/kernel/irq/msi.c
> @@ -59,7 +59,8 @@ struct msi_ctrl {
>  static void msi_domain_free_locked(struct device *dev, struct msi_ctrl *ctrl);
>  static unsigned int msi_domain_get_hwsize(struct device *dev, unsigned int domid);
>  static inline int msi_sysfs_create_group(struct device *dev);
> -
> +static int msi_domain_prepare_irqs(struct irq_domain *domain, struct device *dev,
> +				   int nvec, msi_alloc_info_t *arg);
>  
>  /**
>   * msi_alloc_desc - Allocate an initialized msi_desc
> @@ -1023,6 +1024,7 @@ bool msi_create_device_irq_domain(struct device *dev, unsigned int domid,
>  	bundle->info.ops = &bundle->ops;
>  	bundle->info.data = domain_data;
>  	bundle->info.chip_data = chip_data;
> +	bundle->info.alloc_data = &bundle->alloc_info;
>  
>  	pops = parent->msi_parent_ops;
>  	snprintf(bundle->name, sizeof(bundle->name), "%s%s-%s",
> @@ -1061,11 +1063,18 @@ bool msi_create_device_irq_domain(struct device *dev, unsigned int domid,
>  	if (!domain)
>  		return false;
>  
> +	domain->dev = dev;
> +	dev->msi.data->__domains[domid].domain = domain;
> +
> +	if (msi_domain_prepare_irqs(domain, dev, hwsize, &bundle->alloc_info)) {

Does this work for PCI MSI (as opposed to MSI-X)? hwsize is 1 in the MSI
case, so .msi_prepare() is now called with nvec == 1, without taking
pci_msi_vec_count() into account:

bool pci_setup_msi_device_domain(struct pci_dev *pdev)
{
	[...]

	return pci_create_device_domain(pdev, &pci_msi_template, 1);
}
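
To spell out the concern with a toy model: the sketch below is plain
userspace C, not kernel code. struct toy_msi_dev, toy_prepare() and
toy_alloc() are made-up names, and whether any real MSI controller driver
sizes its per-device state from the nvec it sees in .msi_prepare() is an
assumption here, not something this patch shows. If it does, preparing
once with nvec == 1 leaves no room for a later multi-vector allocation.

```c
#include <errno.h>

/* Hypothetical per-device state that a prepare() callback would size. */
struct toy_msi_dev {
	unsigned int capacity;	/* vectors reserved at prepare time */
	unsigned int used;	/* vectors handed out so far */
};

/* Toy prepare step, run once when the per-device domain is created. */
static int toy_prepare(struct toy_msi_dev *d, unsigned int nvec)
{
	if (nvec == 0)
		return -EINVAL;

	d->capacity = nvec;
	d->used = 0;
	return 0;
}

/* Toy allocation step: fails once the reservation is exhausted. */
static int toy_alloc(struct toy_msi_dev *d, unsigned int nvec)
{
	if (nvec > d->capacity - d->used)
		return -ENOSPC;

	d->used += nvec;
	return 0;
}
```

In this model, toy_prepare(&d, 1) followed by toy_alloc(&d, 4) fails with
-ENOSPC, while preparing with the real vector count lets the same
allocation succeed.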

Thanks,
Zenghui


