Re: [RFC] Adding bootsource and reset-source* to /chosen

Quentin Schulz <quentin.schulz@xxxxxxxxx> · Fri, 17 Jan 2025 18:27:39 +0100

Hi Rob,

On 1/14/25 9:36 PM, Rob Herring wrote:
On Fri, Dec 13, 2024 at 12:08 PM Quentin Schulz
<quentin.schulz@xxxxxxxxx> wrote:

Hi all,

In U-Boot, there are multiple stages which can be loaded from different
media each. On Rockchip SoCs, it mostly goes this way, SoC BootROM loads
from a medium (storage or over USB) and executes the TPL (or SPL if no
TPL supported), the TPL goes back to BootROM and reads the SPL from the
same medium and executes it. The SPL will favor the same medium for
loading the U-Boot FIT image (containing TF-A BL31, OP-TEE OS BL32 and
U-Boot proper image), but in case it fails, falls back to other possible
media. In U-Boot we have the storage medium used for loading the U-Boot
FIT image stored in the DTB under /chosen/u-boot,spl-boot-device
property. But we do not have a property for the medium the BootROM
loaded the TPL (or SPL) from. This could be useful information. For
example, I have some CI where I want to check whether the BootROM loaded
the TPL from the expected location (important when checking that they
all work and don't depend on a bootloader flashed in a fallback medium
for example). Another usecase could be to detect whenever one loads the
TPL from USB (not a USB stick, some USB transfer; see rockusb for
Rockchip for example) and decide to do something else in U-Boot or the
OS depending on that (could be starting DFU for flashing the device, or
fastboot for loading the rest of the system).

Barebox currently does that by setting /chosen/bootsource to the full
path of the device the BootROM says it booted from, but that isn't part
of the DT spec so before I add support in U-Boot for that, I'm asking
whether that's fine or if we should correct course in Barebox and use
another name and/or location.

I don't see anything in your answer that would be applicable to 
/chosen/bootsource, am I correct in assuming that this part can go ahead 
and be submitted to the Device Tree schema/spec?

I've also been made aware of other "sources" that are added to the
/chosen node in Barebox, reset-source, reset-source-instance and
reset-source-device. I am not entirely sure what those exactly are for
BUT it seems that this is of interest to U-Boot as well since we do try
to report a reset cause as read from the CPU for Rockchip SoCs, c.f.
https://elixir.bootlin.com/u-boot/v2024.10/source/arch/arm/mach-rockchip/cpu-info.c#L14.
I believe /chosen/reset-source would be the reason, e.g. "POR", "RST",
"WDG", "WKE", "THERM", etc. /chosen/reset-source-device contains the
node path to the device that triggered that, that could be useful in the
event there are more than one thermal trip point (e.g. CPU or GPU), or
more than one hardware watchdog (I for example have a system with
potentially 4: SoC, PMIC, 2 separate in an external MCU). I guess
/chosen/reset-source-instance is for HW blocks that expose the same
reset but from two different sources which aren't modeled in Device
Tree. Again, this would be useful for the bootloader or OS to decide to
do something else based on the reset cause (e.g. have data on why some
devices reboot in the field, e.g. because of thermals, power loss,
brownout, etc..).

Is the handling of these properties something common/generic or going
to be device (e.g. wdog, pmic) specific. For the former, then /chosen
makes sense. For the latter, maybe everything being in the device
nodes makes more sense. Then you wouldn't need a link to the device.

I would imagine reset-source (the reason) to be device-specific. I'm 
sure there'll be some HW in the future with some odd reason to do a full 
HW reset or somehow support 10 different watchdogs in the same IP and we 
may want to know which one triggered it, I assume for that we wouldn't 
want WDG1 to WDG10 as possible reset-source? Though it seems that's what 
reset-source-instance would be for. Do we want to limit what the values 
of /chosen/reset-source could be in the schema? e.g. WDG for watchdog 
reset, THERM for temperature threshold exceeded, etc..

The issue I see with not having /chosen/reset-source-device but have 
some other property in the device node are two folds:

- in order to know which device triggered the reset, one will need to 
parse the whole tree for a specific property,
- I could imagine a scenario where a device not supported by the OS (or 
only auto-discoverable?) triggers a HW reset (e.g. from TF-A, OP-TEE, a 
sidecar microcontroller, etc...), I wouldn't be able to add the property 
to a node. I also concede that I don't know what kind of path would make 
sense in that property in that event, so maybe that's something we can 
just ignore for the moment, or is not realistic.

I'm reaching out to the devicetree-spec community before I add code in
U-Boot to implement this so we don't add unnecessary backward
compatibility code, were the properties or locations to be modified to
match the spec.

Does anyone disagree with adding this to the spec? Does anyone have
another suggestion for the naming or location of the properties?

I always feel better if we have multiple users for a new, common
binding. A new binding extended by a 2nd user shortly after defining
it makes me grumpy. Here's a thread that came up with the same type of
issue:

https://lore.kernel.org/all/20250113112349.801875-1-prabhakar.mahadev-lad.rj@xxxxxxxxxxxxxx/

I see...

The bootloader should be able to do something based on the reset source 
and reset device. Some HW devices do have clear on read status 
registers, so expecting the bootloader to not touch status registers is 
unrealistic in my opinion.

Also, as a user/distro I would need to try to get from all possible 
devices if any has a reason for the last system reset, across different 
subsystems I assume, so different APIs and all. This seems like a lot of 
work and error-prone.

The issue could be that they need the driver to do something specific if 
the device handled by that driver is the reason for the last system 
reset. If that's the case, then we could parse 
/chosen/reset-source-device and see it it's the same path as the 
device's and do something based on that and the other /chosen/reset-* 
properties. However, if there's a read-on-clear status register and we 
don't pass enough information through those properties, the device 
driver will not be able to handle things the way they want to.

Cheers,
Quentin