On Tue 09 Feb 08:27 CST 2021, Rob Herring wrote:

> On Mon, Feb 8, 2021 at 5:10 PM Alexandre Belloni
> <alexandre.belloni@xxxxxxxxxxx> wrote:
> >
> > On 08/02/2021 23:14:02+0100, Arnd Bergmann wrote:
> > > On Mon, Feb 8, 2021 at 10:35 PM Alexandre Belloni
> > > <alexandre.belloni@xxxxxxxxxxx> wrote:
> > > > On 08/02/2021 20:52:37+0100, Arnd Bergmann wrote:
> > > > > On Mon, Feb 8, 2021 at 7:42 PM Krzysztof Kozlowski <krzk@xxxxxxxxxx> wrote:
> > > > > > Let me steer the discussion to original topic - it's about old kernel
> > > > > > and new DTB, assuming that mainline kernel bisectability is not
> > > > > > affected.
> > > > > >
> > > > > > Flow looks like this:
> > > > > >
> > > > > > 0. You have existing bindings and drivers.
> > > > > > 1. Patch changing bindings (with new compatible) and drivers gets
> > > > > >    accepted by maintainer.
> > > > > > 2. Patch above (bindings+drivers) goes during merge window to v5.11-rc1.
> > > > > > 3. Patch changing in-tree DTS to new compatible gets accepted by
> > > > > >    maintainer and it is sent as v5.12-rc1 material to SoC maintainers.
> > > > > >
> > > > > > So again: old kernel, using old bindings, new DTB.
> > > > > >
> > > >
> > > > I don't think forward compatibility was ever considered. I've seen it
> > > > being mentioned a few times on #armlinux but honestly this simply can't
> > > > be achieved. This would mean being able to write complete DT bindings
> > > > for a particular SoC at day 0 which will realistically never happen. You
> > > > may not even have a complete datasheet and even if you have a datasheet,
> > > > it may not be complete or it may be missing hw errata that are
> > > > discovered later on and need a new binding to handle.
> > >
> > > You do not have to write the correct DT for this, the only requirement
> > > is that any changes to a node are backward-compatible, which is
> > > typically the case if you add properties or compatible strings without
> > > removing the old one. A bugfix in this case is also backward-compatible.
> > >
> > > The part that can not happen instead is to write a DT that can expose
> > > features that any future kernel will use.
> > >
> >
> > But I think we are speaking about the other way around where you would be
> > e.g. removing properties or splitting a node into multiple different
> > nodes following a different understanding of the hardware.
> > And in this case, any rework of the bindings will be forbidden, like
> > 32b7cfbd4bb2 ("ARM: dts: at91: remove deprecated ADC properties") will
> > break older kernels trying to use the new dtb.
> > 761f6ed85417 ("ARM: dts: at91: sama5d4: use correct rtc compatible") is
> > another case.
> > I'm not sure we want to keep the older properties or the older compatible
> > string as a fallback for this use case.
> >
> > > > > However, once the firmware is updated, it may no longer be possible to
> > > > > go back to the old kernel in case the new one is busted.
> > > > >
> > > >
> > > > Any serious update strategy will update both the kernel and device tree
> > > > at the same time, exactly like you already have to update the initramfs
> > > > with the kernel as soon as it is including kernel modules.
> > > > I would expect any embedded platform to actually use a container format,
> > > > like a FIT image that will ship the kernel, DT and initramfs in a single
> > > > image and will allow to sign all parts.
> > >
> > > Embedded systems that do this have no requirement for backward
> > > or forward compatibility at all, the only requirement for these is
> > > bisectability of git commits.
> > >
> >
> > Yes and I can't see any drawbacks in this approach.
> >
> > > > > A similar problem can happen with the EBBR boot flow that relies on
> > > > > a uefi-enabled firmware such as a u-boot, while using grub2 as the
> > > > > actual boot loader. This is commonly supported across distros. While
> > > > > grub2 can load a matching set of kernel+initrd+dtb from disk and run
> > > > > that, this often fails in practice because u-boot needs to fill a
> > > > > board specific set of DT properties (bootargs, detected memory,
> > > > > mac address, ...). The usual way this gets handled is that u-boot loads
> > > > > grub2 and the dtb from disk and then passes the modified dtb to grub,
> > > > > which picks only kernel+initrd from disk and boots this with the dtb.
> > > > >
> > > > > The result is similar to the case with dtb built into the firmware: after
> > > > > upgrading the dtb that gets loaded by u-boot, grub can still pick
> > > > > old kernels but they may not work as they did in the past. There are
> > > > > obviously ways to work around it, but it does lead to user frustration.
> > > > >
> > > >
> > > > Are there really any platforms with the dtb built into the firmware?
> > > > I feel like this is a mythical creature used to scare people into keeping
> > > > the DTB ABI stable. Aren't all the distributions already able to cope
> > > > with keeping DTB and kernel in sync?
> > >
> > > I think most traditional PowerPC systems fall into this category, most
> >
> > My understanding was that the traditional PPC systems had a small device
> > tree and usually are not affected by driver changes but I may be wrong.
> >
> > > systems that boot using UEFI+grub (as I explained), and anyone who
> > > uses a distro kernel on custom hardware with their own dtb.
> > >
> >
> > Aren't the ones using a distro kernel with a custom dtb more concerned
> > by backward compatibility (i.e. new kernel with old dtb) rather than old
> > kernel on new dtb? If they have an old dtb, an old kernel, and update to
> > a new kernel, backward compatibility will ensure this continues to work.
> > If then they work on updating their dtb, they still have the old one and
> > can make the distribution match dtb and kernel. This is already handled
> > properly by debian and I guess the other distributions as it is anyway
> > already matching kernel and initramfs.
>
> SUSE is doing the opposite AIUI. This is a bit harder because adding
> any new provider breaks compatibility as the old kernel will wait for
> a non-existent driver for the new provider. That was the motivation
> for deferred probe timeouts. Of course, I wouldn't really call a
> platform stable if you are still adding clock, pinctrl, power-domain,
> etc. providers.
>

IMHO "stable" in this context means that we've hit the point in
development when these questions are no longer relevant. Either because
the development is _done_ or more likely it's too old for anyone to
care.

Unfortunately this is the state that we're optimizing for and we're
simply relying on luck to boot Linux on a reasonably complex machine.

Regards,
Bjorn