[RFC PATCH devicetree 00/10] Do something about ls-extirq interrupt-map breakage

Marc Zyngier maz at kernel.org
Thu Mar 24 11:06:51 PDT 2022


On Thu, 24 Mar 2022 17:34:06 +0000,
Vladimir Oltean <vladimir.oltean at nxp.com> wrote:
> 
> On Thu, Mar 24, 2022 at 05:21:50PM +0000, Marc Zyngier wrote:
> > On Thu, 24 Mar 2022 17:10:42 +0000,
> > Vladimir Oltean <vladimir.oltean at nxp.com> wrote:
> > > 
> > > Hello Marc,
> > > 
> > > On Tue, Dec 14, 2021 at 10:20:36AM +0000, Marc Zyngier wrote:
> > > > On Tue, 14 Dec 2021 09:58:54 +0000,
> > > > Vladimir Oltean <vladimir.oltean at nxp.com> wrote:
> > > > > 
> > > > > Hi Marc (with a c),
> > > > > 
> > > > > I wish the firmware for these SoCs was smart enough to be compatible
> > > > > with the bindings that are in the kernel and provide a blob that the
> > > > > kernel could actually use. Some work has been started there and this is
> > > > > work in progress. True, I don't know what other OF-based firmware some
> > > > > other customers may use, but I trust it isn't a lot more advanced than
> > > > > what U-Boot currently has :)
> > > > > 
> > > > > Also, the machines may have been in the wild for years, but the
> > > > > ls-extirq driver was added in November 2019. So not with the
> > > > > introduction of the SoC device trees themselves. That isn't so long ago.
> > > > > 
> > > > > As for compatibility between old kernel and new DT: I guess you'll hear
> > > > > various opinions on this one.
> > > > > https://www.spinics.net/lists/linux-mips/msg07778.html
> > > > > 
> > > > > | > Are we okay with the new device tree blobs breaking the old kernel?
> > > > > |
> > > > > | From my point of view, newer device trees are not required to work on
> > > > > | older kernel, this would impose an unreasonable limitation and the use
> > > > > | case is very limited.
> > > > 
> > > > My views are on the opposite side. DT is an ABI, full stop. If you
> > > > change something, you *must* guarantee forward *and* backward
> > > > compatibility. That's because:
> > > > 
> > > > - you don't control how updatable the firmware is
> > > > 
> > > > - people may need to revert to other versions of the kernel because
> > > >   the new one is broken
> > > > 
> > > > - there are plenty of DT users beyond Linux, and we are not creating
> > > >   bindings for Linux only.
> > > > 
> > > > You may disagree with this, but for the subsystems I maintain, this is
> > > > the rule I intend to stick to.
> > > > 
> > > > 	M.
> > > > 
> > > > -- 
> > > > Without deviation from the norm, progress is not possible.
> > > 
> > > I was just debugging an interesting issue with an old kernel not working
> > > with a new DT blob, and after figuring out what the problem was (is),
> > > I remembered this message and I'm curious what you have to say about it.
> > > 
> > > I have this DT layout:
> > > 
> > > 	ethernet-phy@1 {
> > > 		reg = <0x1>;
> > > 		interrupts-extended = <&extirq 2 IRQ_TYPE_LEVEL_LOW>;
> > > 	};
> > > 
> > > 	extirq: interrupt-controller@1ac {
> > > 		compatible = "fsl,ls1021a-extirq";
> > > 		<bla bla>
> > > 	};
> > > 
> > > I booted the new DT blob (which has "interrupts-extended") on a kernel
> > > where the ls-extirq driver did not exist. This had the result of
> > > of_mdiobus_phy_device_register() -> of_irq_get() returning -EPROBE_DEFER
> > > forever and ever. So the PHY driver in turn never probed, and Ethernet
> > > was broken. So I had to delete the interrupts OF property to let the PHY
> > > at least work in poll mode.
> > > 
> > > What went wrong here in your opinion?
> > 
> > I'm not sure what you expect me to say here. You have a device that
> > references an interrupt. The DT seems sound (I don't get why you think
> > "interrupts-extended" is a problem here, but hey...).
> > 
> > If your kernel doesn't have a driver for the interrupt controller
> > referenced here, what do you expect, other than things not working?
> > 
> > 	M.
> > 
> > -- 
> > Without deviation from the norm, progress is not possible.
> 
> I was just raising this as what I thought would be a simple and
> non-controversial counter example to your remark "If you change something,
> you *must* guarantee forward *and* backward compatibility."

That applies if you change something *in the binding*, which was
implicit in the context; the remark makes no sense outside of it.

> 
> Practically speaking, what has happened is that the board DT appeared in
> kernel N, the ls-extirq driver in kernel N+1, and the DT was updated to
> enable PHY interrupts in kernel N+2. That DT update practically broke
> kernel N from running correctly on DTs taken from kernel N+2 onwards.
> This is the observable behavior, we can find as many justifications for
> it as we wish.

Well, you can also argue that the DT was broken at N and N+1 for not
describing the HW correctly and completely. No binding has changed
here. Your DT was incomplete, and someone fixed it for you.
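
The failure mode described in the quoted report (a consumer whose
interrupt lookup keeps returning -EPROBE_DEFER because no driver ever
binds to the interrupt controller) can be sketched in plain userspace C.
This is a toy model under stated assumptions, not kernel code: struct
irq_provider, model_irq_get() and model_phy_probe() are hypothetical
names invented for the illustration, and only the EPROBE_DEFER value
(517) is taken from the kernel.

```c
#include <stdbool.h>

/* Same numeric value the kernel uses for EPROBE_DEFER. */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for an interrupt controller node in the DT. */
struct irq_provider {
	bool driver_present;	/* does this kernel have a driver for it? */
	int base_irq;		/* made-up first IRQ number it would hand out */
};

/* Toy IRQ lookup: defer for as long as the provider has no driver. */
int model_irq_get(const struct irq_provider *p, int hwirq)
{
	if (!p->driver_present)
		return -EPROBE_DEFER;
	return p->base_irq + hwirq;
}

/*
 * Toy consumer probe, retried up to max_tries times.
 * Returns an IRQ number on success, or -EPROBE_DEFER if every
 * attempt was deferred.
 */
int model_phy_probe(const struct irq_provider *p, int hwirq, int max_tries)
{
	int ret = -EPROBE_DEFER;

	for (int i = 0; i < max_tries && ret == -EPROBE_DEFER; i++)
		ret = model_irq_get(p, hwirq);

	return ret;
}
```

With driver_present set to false, model_phy_probe() returns
-EPROBE_DEFER no matter how many retries are allowed; with it set to
true, the first lookup succeeds and returns base_irq + hwirq.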

We can argue these things forever and a half. I've laid down the ground
rules for the stuff I maintain. If you're not happy with this, you can
fix it by either removing the NXP hardware from the tree, or taking
over from me as the irqchip maintainer. I'd be perfectly happy with
any (and even more, with both) of these outcomes.

	M.

-- 
Without deviation from the norm, progress is not possible.


