[RFC 1/2] arm64: PCI: Allow use arch-specific pci sysdata

Bjorn Helgaas helgaas at kernel.org
Fri Mar 19 21:12:46 GMT 2021


[+cc Arnd (author of 37d6a0a6f470 ("PCI: Add
pci_register_host_bridge() interface"), which I think would make my
idea below possible), Marc (IRQ domains maintainer)]

On Sat, Mar 20, 2021 at 12:19:55AM +0800, Boqun Feng wrote:
> Currently, if an architecture selects CONFIG_PCI_DOMAINS_GENERIC, the
> ->sysdata in bus and bridge will be treated as struct pci_config_window,
> which is created by generic ECAM using the data from acpi.

It might be a mistake that we put the struct pci_config_window
pointer, which is really arch-independent, in the ->sysdata element,
which normally contains a pointer to arch- or host bridge-dependent 
data.

> However, for a virtualized PCI bus, there might not be enough data in the
> OF or ACPI table to create a pci_config_window. This is similar to the case
> where CONFIG_PCI_DOMAINS_GENERIC=n, IOW, architectures use their own
> structure for sysdata, so no ACPI table lookup is required.
> 
> In order to enable Hyper-V's virtual PCI (which doesn't have acpi table
> entry for PCI) on ARM64 (which selects CONFIG_PCI_DOMAINS_GENERIC), we
> introduce arch-specific pci sysdata (similar to the one for x86) for
> ARM64, and allow the core PCI code to detect the type of sysdata at the
> runtime. The latter is achieved by adding a pci_ops::use_arch_sysdata
> field.
> 
> Originally-by: Sunil Muthuswamy <sunilmut at microsoft.com>
> Signed-off-by: Boqun Feng (Microsoft) <boqun.feng at gmail.com>
> ---
>  arch/arm64/include/asm/pci.h | 29 +++++++++++++++++++++++++++++
>  arch/arm64/kernel/pci.c      | 15 ++++++++++++---
>  include/linux/pci.h          |  3 +++
>  3 files changed, 44 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b33ca260e3c9..dade061a0658 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -22,6 +22,16 @@
>  
>  extern int isa_dma_bridge_buggy;
>  
> +struct pci_sysdata {
> +	int domain;	/* PCI domain */
> +	int node;	/* NUMA Node */
> +#ifdef CONFIG_ACPI
> +	struct acpi_device *companion;	/* ACPI companion device */
> +#endif
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +	void *fwnode;			/* IRQ domain for MSI assignment */
> +#endif
> +};

Our PCI domain code is really a mess (mostly my fault) and I hate to
make it even more complicated by adding more switches, e.g.,
->use_arch_sysdata.

I think the design problem is that PCI host bridge drivers should
supply the PCI domain up front instead of having callbacks to extract
it.

We could put "int domain_nr" in struct pci_host_bridge, and the arch
code or host bridge driver (pcibios_init_hw(), *_pcie_probe(), VMD,
HV, etc) could fill in pci_host_bridge.domain_nr before calling
pci_scan_root_bus_bridge() or pci_host_probe().

Then maybe we could get rid of pci_bus_find_domain_nr() and some of
the needlessly arch-specific implementations of pci_domain_nr().
I think we likely could get rid of CONFIG_PCI_DOMAINS_GENERIC, too,
eventually.
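
To make the idea above concrete, here is a minimal userspace sketch,
not kernel code.  The structs are simplified stand-ins, the
"domain_nr" field in pci_host_bridge is the hypothetical addition, and
example_host_probe() is an invented name standing in for whatever
probe path a driver (HV, VMD, etc) already has:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; not the kernel definitions. */
struct pci_host_bridge {
	int domain_nr;		/* hypothetical: filled in by the driver */
	void *sysdata;		/* arch- or driver-private, opaque to the core */
};

struct pci_bus {
	struct pci_host_bridge *bridge;	/* owning host bridge */
	unsigned char number;
};

/* One arch-independent lookup: no callback and no cast of ->sysdata. */
int pci_domain_nr(const struct pci_bus *bus)
{
	assert(bus != NULL && bus->bridge != NULL);
	return bus->bridge->domain_nr;
}

/* Hypothetical probe path: the driver supplies the domain up front. */
void example_host_probe(struct pci_host_bridge *bridge,
			struct pci_bus *root, int domain)
{
	bridge->domain_nr = domain;
	root->bridge = bridge;
	root->number = 0;
}
```

With this shape the core never needs to know what ->sysdata points
to, so a switch like ->use_arch_sysdata would not be needed for the
domain lookup.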

>  #ifdef CONFIG_PCI
>  static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  {
> @@ -31,8 +41,27 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
>  
>  static inline int pci_proc_domain(struct pci_bus *bus)
>  {
> +	if (bus->ops->use_arch_sysdata)
> +		return pci_domain_nr(bus);
>  	return 1;

I don't understand this.  pci_proc_domain() returns a boolean and
determines whether the /proc/bus/pci/ directory contains, e.g.,

  /proc/bus/pci/00            or
  /proc/bus/pci/0000:00

On arm64, pci_proc_domain() currently always returns 1, so the
directory contains "0000:00".  After these patches, pci_proc_domain()
returns 0 if ->use_arch_sysdata is set and "bus" is in domain 0, so
those buses will be "00" instead of "0000:00".

This doesn't make sense to me, but at the very least, this
user-visible change needs to be explained.
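
A small userspace model of the naming change, as I understand it.
struct mock_bus and both function names are made up for this sketch;
only the two format strings are meant to mirror what the procfs code
does with the pci_proc_domain() result:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Mock of the few fields that matter here; not the kernel's struct pci_bus. */
struct mock_bus {
	int domain;
	unsigned char number;
	int use_arch_sysdata;	/* models bus->ops->use_arch_sysdata */
};

/* Models pci_proc_domain() after the patch. */
int pci_proc_domain_patched(const struct mock_bus *bus)
{
	if (bus->use_arch_sysdata)
		return bus->domain;	/* domain number used as a boolean */
	return 1;
}

/* Builds the /proc/bus/pci/ directory name; returns its length. */
int proc_bus_dir_name(const struct mock_bus *bus, char *buf, size_t len)
{
	assert(bus != NULL && buf != NULL && len > 0);
	if (pci_proc_domain_patched(bus))
		return snprintf(buf, len, "%04x:%02x",
				(unsigned int)bus->domain,
				(unsigned int)bus->number);
	return snprintf(buf, len, "%02x", (unsigned int)bus->number);
}
```

In this model a bus in domain 0 with use_arch_sysdata set gets the
short name, while the same bus without it, or a bus in any nonzero
domain, keeps the "dddd:bb" form.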

>  }
> +#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
> +static inline void *_pci_root_bus_fwnode(struct pci_bus *bus)
> +{
> +	struct pci_sysdata *sd = bus->sysdata;
> +
> +	if (bus->ops->use_arch_sysdata)
> +		return sd->fwnode;
> +
> +	/*
> +	 * bus->sysdata is not struct pci_sysdata, fwnode should be able to
> +	 * be queried from of/acpi.
> +	 */
> +	return NULL;
> +}
> +#define pci_root_bus_fwnode	_pci_root_bus_fwnode

Ugh.  pci_root_bus_fwnode() is another callback to find the
irq_domain.  Only one call, from pci_host_bridge_msi_domain(), which
itself is only called from pci_set_bus_msi_domain().  This feels like
another case where we could simplify things by having the host bridge
driver figure out the irq_domain explicitly when it creates the
pci_host_bridge.  It seems like that's where we have the most
information about how to find the irq_domain.
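
Roughly what I have in mind, again as a userspace sketch with
stand-in structs.  The "msi_domain" field in pci_host_bridge and the
bus_msi_domain() helper are hypothetical names for illustration, not
existing interfaces:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for illustration only. */
struct irq_domain {
	const char *name;
};

struct pci_host_bridge {
	struct irq_domain *msi_domain;	/* hypothetical: set at bridge creation */
};

struct pci_bus {
	struct pci_bus *parent;		/* NULL for a root bus */
	struct pci_host_bridge *bridge;	/* valid on the root bus */
};

/* Walk up to the root bus and use what the host bridge driver stored. */
struct irq_domain *bus_msi_domain(const struct pci_bus *bus)
{
	assert(bus != NULL);
	while (bus->parent)
		bus = bus->parent;
	return bus->bridge ? bus->bridge->msi_domain : NULL;
}
```

The point is that the lookup becomes a plain field read, with no
arch callback that has to guess what ->sysdata is.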

> +#endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
> +
>  #endif  /* CONFIG_PCI */
>  
>  #endif  /* __ASM_PCI_H */
> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c
> index 1006ed2d7c60..63d420d57e63 100644
> --- a/arch/arm64/kernel/pci.c
> +++ b/arch/arm64/kernel/pci.c
> @@ -74,15 +74,24 @@ struct acpi_pci_generic_root_info {
>  int acpi_pci_bus_find_domain_nr(struct pci_bus *bus)
>  {
>  	struct pci_config_window *cfg = bus->sysdata;
> -	struct acpi_device *adev = to_acpi_device(cfg->parent);
> -	struct acpi_pci_root *root = acpi_driver_data(adev);
> +	struct pci_sysdata *sd = bus->sysdata;
> +	struct acpi_device *adev;
> +	struct acpi_pci_root *root;
> +
> +	/* struct pci_sysdata has domain nr in it */
> +	if (bus->ops->use_arch_sysdata)
> +		return sd->domain;
> +
> +	/* or pci_config_window is used as sysdata */
> +	adev = to_acpi_device(cfg->parent);
> +	root = acpi_driver_data(adev);

My comments above are a lot of hand-waving without a very clear way
forward.  Would it simplify things to just add a "struct
pci_config_window *ecam_info" to pci_host_bridge, so we wouldn't have
to overload sysdata?
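
Something like the following sketch, where "ecam_info" is the
hypothetical new member, the structs are cut-down stand-ins, and
bridge_uses_generic_ecam() is an invented helper name:

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in; the real struct carries the ECAM mapping state. */
struct pci_config_window {
	void *parent;
};

struct pci_host_bridge {
	void *sysdata;				/* arch- or driver-private again */
	struct pci_config_window *ecam_info;	/* hypothetical; NULL if no generic ECAM */
};

/* Generic code tests the dedicated pointer instead of guessing at ->sysdata. */
int bridge_uses_generic_ecam(const struct pci_host_bridge *bridge)
{
	assert(bridge != NULL);
	return bridge->ecam_info != NULL;
}
```

A bridge created by generic ECAM would set ecam_info, and something
like HV would leave it NULL and keep its own data in ->sysdata.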

>  	return root->segment;
>  }
>  
>  int pcibios_root_bridge_prepare(struct pci_host_bridge *bridge)
>  {
> -	if (!acpi_disabled) {
> +	if (!acpi_disabled && bridge->ops->use_arch_sysdata) {
>  		struct pci_config_window *cfg = bridge->bus->sysdata;
>  		struct acpi_device *adev = to_acpi_device(cfg->parent);
>  		struct device *bus_dev = &bridge->bus->dev;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 86c799c97b77..4036aac40361 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -740,6 +740,9 @@ struct pci_ops {
>  	void __iomem *(*map_bus)(struct pci_bus *bus, unsigned int devfn, int where);
>  	int (*read)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 *val);
>  	int (*write)(struct pci_bus *bus, unsigned int devfn, int where, int size, u32 val);
> +#ifdef CONFIG_PCI_DOMAINS_GENERIC
> +	int use_arch_sysdata;	/* ->sysdata is arch-specific */
> +#endif
>  };
>  
>  /*
> -- 
> 2.30.2
> 


