[PATCH v4] of: property: Add fw_devlink support for interrupt-map property

Marc Zyngier maz at kernel.org
Tue May 28 09:36:14 PDT 2024


On Thu, 09 May 2024 13:08:20 +0100,
Anup Patel <apatel at ventanamicro.com> wrote:
> 
> Some of the PCI host controllers (such as generic PCI host controller)
> use "interrupt-map" DT property to describe the mapping between PCI
> endpoints and PCI interrupt pins. This is the only case where the
> interrupts are not described in DT.
> 
> Currently, there is no fw_devlink created based on "interrupt-map"
> DT property so interrupt controller is not guaranteed to be probed
> before the PCI host controller. This affects every platform where
> both PCI host controller and interrupt controllers are probed as
> regular platform devices.
> 
> This creates fw_devlink between consumers (PCI host controller) and
> supplier (interrupt controller) based on "interrupt-map" DT property.
> 
> Signed-off-by: Anup Patel <apatel at ventanamicro.com>
> Reviewed-by: Saravana Kannan <saravanak at google.com>
> ---
> Changes since v3:
> - Added a comment about of_irq_parse_raw()
> - Removed redundant NULL assignments to sup_args.np
> Changes since v2:
> - No need for a loop to find #interrupt-cells property value
> - Fix node de-reference leak when index is greater than number
>   of entries in interrupt-map property
> Changes since v1:
> - Updated commit description based on Rob's suggestion
> - Use of_irq_parse_raw() for parsing interrupt-map DT property

This patch breaks badly on my M1 Mini, with a continuous stream of
boot time warnings lasting about 100 seconds:

[   97.832335] ------------[ cut here ]------------
[   97.836955] /soc/pcie at 690000000/pci at 2,0 interrupt-map failed, using interrupt-controller
[   97.845072] WARNING: CPU: 0 PID: 1 at drivers/of/irq.c:277 of_irq_parse_raw+0x620/0x730
[   97.853087] Modules linked in:
[   97.856139] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W          6.10.0-rc1 #2915
[   97.864163] Hardware name: Apple Mac mini (M1, 2020) (DT)
[   97.869570] pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[   97.876546] pc : of_irq_parse_raw+0x620/0x730
[   97.880907] lr : of_irq_parse_raw+0x620/0x730
[   97.885267] sp : ffffc000800538a0
[   97.888581] x29: ffffc000800538a0 x28: 0000000000000000 x27: ffffffffff5ffc68
[   97.895732] x26: ffffc000800539a4 x25: ffffffffff5ffbbc x24: ffffc00080053a48
[   97.902883] x23: ffffe3ff73284e68 x22: ffffb7889aede6d0 x21: ffffe3ff735f9320
[   97.910034] x20: ffffc00080053964 x19: ffffb7889aede6d0 x18: ffffffffffffffff
[   97.917185] x17: 7075727265746e69 x16: 20676e697375202c x15: 64656c6961662070
[   97.924336] x14: 616d2d7470757272 x13: 72656c6c6f72746e x12: 6f632d7470757272
[   97.931487] x11: 65746e6920676e69 x10: ffffe3ff740bcb68 x9 : ffffe3ff72335b38
[   97.938639] x8 : 00000001000028a4 x7 : ffffe3ff740afc08 x6 : 00000000000038a4
[   97.945789] x5 : 00000000000067b0 x4 : c0000001000028a4 x3 : 0000000000000000
[   97.952940] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffb784c16ade00
[   97.960092] Call trace:
[   97.962534]  of_irq_parse_raw+0x620/0x730
[   97.966545]  parse_interrupt_map+0xfc/0x188
[   97.970731]  of_fwnode_add_links+0x170/0x1e0
[   97.975004]  fw_devlink_parse_fwtree+0x44/0x98
[   97.979452]  fw_devlink_parse_fwtree+0x6c/0x98
[   97.983899]  fw_devlink_parse_fwtree+0x6c/0x98
[   97.988347]  device_add+0x610/0x6a8
[   97.991836]  of_device_add+0x4c/0x70
[   97.995411]  of_platform_device_create_pdata+0xa0/0x160
[   98.000644]  of_platform_bus_create+0x184/0x370
[   98.005178]  of_platform_populate+0x68/0x160
[   98.009451]  of_platform_default_populate_init+0xf4/0x118
[   98.014859]  do_one_initcall+0x4c/0x320
[   98.018695]  do_initcalls+0xf4/0x1d8
[   98.022271]  kernel_init_freeable+0x12c/0x280
[   98.026632]  kernel_init+0x2c/0x1f8
[   98.030120]  ret_from_fork+0x10/0x20
[   98.033696] ---[ end trace 0000000000000000 ]---

which comes from 10a20b34d735f ("of/irq: Don't ignore
interrupt-controller when interrupt-map failed").

Each of the 3 PCIe ports are described as such:

	port02: pci at 2,0 {
		device_type = "pci";
		reg = <0x1000 0x0 0x0 0x0 0x0>;
		reset-gpios = <&pinctrl_ap 33 GPIO_ACTIVE_LOW>;

		#address-cells = <3>;
		#size-cells = <2>;
		ranges;

		interrupt-controller;
		#interrupt-cells = <1>;

		interrupt-map-mask = <0 0 0 7>;
		interrupt-map = <0 0 0 1 &port02 0 0 0 0>,
				<0 0 0 2 &port02 0 0 0 1>,
				<0 0 0 3 &port02 0 0 0 2>,
				<0 0 0 4 &port02 0 0 0 3>;
		status = "disabled";
	};

and get probed *972 times*, which seem... excessive, given that there
are only 4 entries per port.

> ---
>  drivers/of/property.c | 52 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
> 
> diff --git a/drivers/of/property.c b/drivers/of/property.c
> index a6358ee99b74..2d749a18b037 100644
> --- a/drivers/of/property.c
> +++ b/drivers/of/property.c
> @@ -1311,6 +1311,57 @@ static struct device_node *parse_interrupts(struct device_node *np,
>  	return of_irq_parse_one(np, index, &sup_args) ? NULL : sup_args.np;
>  }
>  
> +static struct device_node *parse_interrupt_map(struct device_node *np,
> +					       const char *prop_name, int index)
> +{
> +	const __be32 *imap, *imap_end, *addr;
> +	struct of_phandle_args sup_args;
> +	u32 addrcells, intcells;
> +	int i, imaplen;
> +
> +	if (!IS_ENABLED(CONFIG_OF_IRQ))
> +		return NULL;
> +
> +	if (strcmp(prop_name, "interrupt-map"))
> +		return NULL;
> +
> +	if (of_property_read_u32(np, "#interrupt-cells", &intcells))
> +		return NULL;
> +	addrcells = of_bus_n_addr_cells(np);
> +
> +	imap = of_get_property(np, "interrupt-map", &imaplen);
> +	if (!imap || imaplen <= (addrcells + intcells))

This is "interesting". You compare a number of *bytes* with a number
of cells. Only off by a factor of 4...

Also, you need a minimum of one extra cell to hold the phandle, and a
yet unknown number of cells for whatever follows the phandle.

> +		return NULL;
> +	imap_end = imap + imaplen;

Same problem, with pointer arithmetic this time.

> +
> +	while (imap < imap_end) {
> +		addr = imap;
> +		imap += addrcells;
> +
> +		sup_args.np = np;
> +		sup_args.args_count = intcells;
> +		for (i = 0; i < intcells; i++)
> +			sup_args.args[i] = be32_to_cpu(imap[i]);
> +		imap += intcells;
> +
> +		/*
> +		 * Upon success, the function of_irq_parse_raw() returns
> +		 * interrupt controller DT node pointer in sup_args.np.
> +		 */
> +		if (of_irq_parse_raw(addr, &sup_args))
> +			return NULL;
> +
> +		if (!index)
> +			return sup_args.np;
> +
> +		of_node_put(sup_args.np);
> +		imap += sup_args.args_count + 1;

This really doesn't map (pun intended) to the way the interrupt-map
entries are built. You need to account for the parent address size as
well.

I'll post a patch fixing both issues.

	M.

-- 
Without deviation from the norm, progress is not possible.



More information about the linux-riscv mailing list