On Thu, Mar 24, 2022 at 05:21:50PM +0000, Marc Zyngier wrote:
> On Thu, 24 Mar 2022 17:10:42 +0000,
> Vladimir Oltean <vladimir.oltean@xxxxxxx> wrote:
> >
> > Hello Marc,
> >
> > On Tue, Dec 14, 2021 at 10:20:36AM +0000, Marc Zyngier wrote:
> > > On Tue, 14 Dec 2021 09:58:54 +0000,
> > > Vladimir Oltean <vladimir.oltean@xxxxxxx> wrote:
> > > >
> > > > Hi Marc (with a c),
> > > >
> > > > I wish the firmware for these SoCs was smart enough to be compatible
> > > > with the bindings that are in the kernel and provide a blob that the
> > > > kernel could actually use. Some work has been started there and this is
> > > > work in progress. True, I don't know what other OF-based firmware some
> > > > other customers may use, but I trust it isn't a lot more advanced than
> > > > what U-Boot currently has :)
> > > >
> > > > Also, the machines may have been in the wild for years, but the
> > > > ls-extirq driver was added in November 2019. So not with the
> > > > introduction of the SoC device trees themselves. That isn't so long ago.
> > > >
> > > > As for compatibility between old kernel and new DT: I guess you'll hear
> > > > various opinions on this one.
> > > > https://www.spinics.net/lists/linux-mips/msg07778.html
> > > >
> > > > | > Are we okay with the new device tree blobs breaking the old kernel?
> > > > | >
> > > > | From my point of view, newer device trees are not required to work on
> > > > | older kernel, this would impose an unreasonable limitation and the use
> > > > | case is very limited.
> > >
> > > My views are on the opposite side. DT is an ABI, full stop. If you
> > > change something, you *must* guarantee forward *and* backward
> > > compatibility. That's because:
> > >
> > > - you don't control how updatable the firmware is
> > >
> > > - people may need to revert to other versions of the kernel because
> > >   the new one is broken
> > >
> > > - there are plenty of DT users beyond Linux, and we are not creating
> > >   bindings for Linux only.
> > >
> > > You may disagree with this, but for the subsystems I maintain, this is
> > > the rule I intent to stick to.
> > >
> > > M.
> > >
> > > --
> > > Without deviation from the norm, progress is not possible.
> >
> > I was just debugging an interesting issue with an old kernel not working
> > with a new DT blob, and after figuring out what the problem was (is),
> > I remembered this message and I'm curious what you have to say about it.
> >
> > I have this DT layout:
> >
> >         ethernet-phy@1 {
> >                 reg = <0x1>;
> >                 interrupts-extended = <&extirq 2 IRQ_TYPE_LEVEL_LOW>;
> >         };
> >
> >         extirq: interrupt-controller@1ac {
> >                 compatible = "fsl,ls1021a-extirq";
> >                 <bla bla>
> >         };
> >
> > I booted the new DT blob (which has "interrupts-extended") on a kernel
> > where the ls-extirq driver did not exist. This had the result of
> > of_mdiobus_phy_device_register() -> of_irq_get() returning -EPROBE_DEFER
> > forever and ever. So the PHY driver in turn never probed, and Ethernet
> > was broken. So I had to delete the interrupts OF property to let the PHY
> > at least work in poll mode.
> >
> > What went wrong here in your opinion?
>
> I'm not sure what you expect me to say here. You have a device that
> references an interrupt. The DT seems sound (I don't get why you think
> "interrupt-extended" is a problem here, but hey...).
>
> If your kernel doesn't have a driver for the interrupt controller
> referenced here, what do you expect, other than things not working?
>
> M.
>
> --
> Without deviation from the norm, progress is not possible.

I was just raising this as what I thought would be a simple and
non-controversial counter example to your remark "If you change
something, you *must* guarantee forward *and* backward compatibility."

Practically speaking, what has happened is that the board DT appeared in
kernel N, the ls-extirq driver in kernel N+1, and the DT was updated to
enable PHY interrupts in kernel N+2.
In practice, that DT update stops kernel N from running correctly with
DTs taken from kernel N+2 onwards. This is the observable behavior; we
can find as many justifications for it as we wish.

(As to what I expect: Ethernet PHYs work without an interrupt too, but
of_mdiobus_phy_device_register() treats -EPROBE_DEFER from of_irq_get()
as special, because it assumes the IRQ domain will eventually come up.
The IRQ is optional, as evidenced by the fact that kernel N worked.)
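
To make the failure mode concrete, here is a minimal user-space sketch of
the decision described above. It is not the kernel code: every model_*
name and the struct layout are hypothetical, the constants are redeclared
locally, and the error returned for a node without an interrupt property
is an assumption. It only illustrates that a missing IRQ domain turns
into a deferral, while every other failure falls back to polling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Redeclared for this model; not taken from kernel headers. */
#define MODEL_EPROBE_DEFER 517  /* "try again later" */
#define MODEL_EINVAL       22   /* assumed error for "no interrupt property" */
#define MODEL_PHY_POLL     (-1) /* no IRQ assigned: poll the PHY instead */

/* Hypothetical stand-in for a PHY's device tree node. */
struct model_node {
	bool has_irq_property;   /* node carries interrupts(-extended) */
	bool irq_domain_present; /* some driver registered the IRQ domain */
	int virq;                /* IRQ number handed out when all is well */
};

struct model_phy {
	int irq;
};

/* Stand-in for of_irq_get(): a positive IRQ number or a negative error. */
static int model_of_irq_get(const struct model_node *np)
{
	if (!np->has_irq_property)
		return -MODEL_EINVAL;
	if (!np->irq_domain_present)
		return -MODEL_EPROBE_DEFER; /* controller may still show up */
	return np->virq;
}

/*
 * Stand-in for the IRQ handling in of_mdiobus_phy_device_register():
 * deferral is propagated to the caller, any other failure means polling.
 */
static int model_phy_register(struct model_phy *phy,
			      const struct model_node *np)
{
	int rc;

	assert(phy != NULL && np != NULL);

	rc = model_of_irq_get(np);
	if (rc == -MODEL_EPROBE_DEFER)
		return rc; /* with no driver ever arriving, this repeats forever */

	phy->irq = (rc > 0) ? rc : MODEL_PHY_POLL;
	return 0;
}
```

In this model, the kernel N DT (no interrupt property) registers the PHY
in poll mode, while the kernel N+2 DT on a kernel without the ls-extirq
driver returns the deferral code on every attempt.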