On 12/15/22 08:16, Miquel Raynal wrote:
Hi Marek & Francesco,
Hi,
marex@xxxxxxx wrote on Mon, 5 Dec 2022 17:25:11 +0100:
On 12/5/22 14:49, Miquel Raynal wrote:
Hi Francesco,
Hi,
francesco@xxxxxxxxxx wrote on Mon, 5 Dec 2022 12:26:44 +0100:
On Fri, Dec 02, 2022 at 06:08:22PM +0100, Marek Vasut wrote:
But here I would say this is a firmware bug and it might have to be handled
like a firmware bug, i.e. with fixup in the partition parser. I seem to be
changing my opinion here again.
I was thinking about this over the weekend, and I came to the following
ideas:
- we need some improvement on the fixup we already have in the
partition parser. We cannot ignore the fdt produced by U-Boot - as
bad as it is.
- the proposed fixup is fine for the immediate need, but it is
not going to be enough to cover the general issue with the U-Boot
generated partitions. U-Boot might keep generating partitions as direct
children of the nand controller even when a partitions{} node is
available. In this case the current parser just fails, since it looks
only into the partitions{} node and finds it empty.
- the current U-Boot only handles partitions{} as a direct child of the
nand-controller, the nand-chip is ignored. This is not the way it is
supposed to work. The U-Boot code would need to be improved.
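To make the second point concrete, here is a minimal sketch of the lookup
problem. It is a toy model, not the ofpart.c code: struct node,
find_child(), count_partitions() and parse() are hypothetical names, and
the "fallback" flag is only one possible heuristic, not something agreed
on in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy device-tree node: a name plus an array of children (hypothetical). */
struct node {
	const char *name;
	const struct node *children;
	size_t nchildren;
};

/* Return the direct child called `name`, or NULL if there is none. */
static const struct node *find_child(const struct node *parent,
				     const char *name)
{
	assert(parent != NULL);
	for (size_t i = 0; i < parent->nchildren; i++)
		if (strcmp(parent->children[i].name, name) == 0)
			return &parent->children[i];
	return NULL;
}

/* Count direct children whose name starts with "partition@". */
static size_t count_partitions(const struct node *n)
{
	size_t count = 0;

	for (size_t i = 0; i < n->nchildren; i++)
		if (strncmp(n->children[i].name, "partition@", 10) == 0)
			count++;
	return count;
}

/*
 * Without fallback: if partitions{} exists, look only inside it.
 * With fallback: if partitions{} exists but is empty, also accept
 * partitions placed as direct children of the controller.
 */
static size_t parse(const struct node *ctrl, int fallback)
{
	const struct node *container = find_child(ctrl, "partitions");
	size_t n;

	if (!container)
		return count_partitions(ctrl); /* legacy layout */

	n = count_partitions(container);
	if (n == 0 && fallback)
		n = count_partitions(ctrl);
	return n;
}
```

With an empty partitions{} node next to two partition@... siblings, the
strict lookup finds nothing, while the fallback variant finds both.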
I've been thinking about it this weekend as well and the current fix
which "just sets" s_cell to 1 seems risky to me, it is typically the
type of quick & dirty fix that might even break other boards (nobody
knew that U-Boot's current logic expected #size-cells to be set in the
DT, what if another "broken" DT expects the opposite...)
Then with the current configuration, such a broken DT would not work, since the current DT does set #size-cells=<1> (wrongly).
, not
to mention potential issues with large storage devices (> 4GiB).
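As a rough illustration of why #size-cells matters here, the sketch below
decodes one partition "reg" entry from a cell array. It is a simplified
model, not the kernel parser: read_cells() and decode_reg() are
hypothetical helpers, cells are assumed to be already in host byte order
(real DT cells are big-endian), and the "fixup" flag mimics the "set
s_cell to 1" approach discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine `cells` 32-bit cells (0, 1 or 2) into one 64-bit value. */
static uint64_t read_cells(const uint32_t *p, int cells)
{
	uint64_t v = 0;

	assert(cells >= 0 && cells <= 2);
	for (int i = 0; i < cells; i++)
		v = (v << 32) | p[i];
	return v;
}

/*
 * Decode one "reg" entry of `ncells` cells into offset and size.
 * If `fixup` is set and s_cells is 0, assume s_cells = 1.
 * Returns 0 on success, -1 if the length does not match the cell counts.
 */
static int decode_reg(const uint32_t *reg, size_t ncells,
		      int a_cells, int s_cells, int fixup,
		      uint64_t *offset, uint64_t *size)
{
	if (fixup && s_cells == 0)
		s_cells = 1;

	if (ncells != (size_t)(a_cells + s_cells))
		return -1;

	*offset = read_cells(reg, a_cells);
	*size = read_cells(reg + a_cells, s_cells);
	return 0;
}
```

In this model, a two-cell reg under #address-cells=<1> and
#size-cells=<0> is rejected unless the fixup is enabled. A size of 4GiB
or more does not fit in a single 32-bit cell, so it needs
#size-cells=<2>, which a blanket "assume 1" cannot express.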
All in all, I really think we should revert the DT change now, reverting
has little to no drawbacks besides a dt_binding_check warning and gives
us time to deal with it properly (both in U-Boot and Linux).
I am really not happy with this, but if that's marked as an intermediate fix, go for it.
How do we deal with this in the long run, however? A parser-side fix like this one, maybe with better heuristics?
Yesterday while talking about an ACPI mis-description which needed
fixing, I realized fixing up what the firmware provides to Linux should
preferably be handled as early as possible. So my first idea was
to avoid using the broken "fixup mtdparts" function in U-Boot and I am
still convinced this is what we should do as a priority. However, as
rightly pointed out in this thread, we need to take care of the case
where someone would use a newer DT (let's say, with the reverted change
applied again) with an old U-Boot. I am still against piggy hacks in
the generic ofpart.c driver, but what we could do however is a DT
fixup in the init_machine (or the dt_fixup) hook for imx7 Colibri, very
much like this:
https://elixir.bootlin.com/linux/latest/source/arch/arm/mach-mvebu/board-v7.c#L111
Plus a warning there saying "your dt is broken, update your firmware".
This does not work, because the old U-Boot fixup_mtdparts() may be
applied on any machine, it is not colibri mx7 specific. Also, new
arch-side workarounds are really not welcome by the architecture
maintainers as far as I can tell.
So next time someone stumbles upon this issue, we can tell them "fix
your bootloader", and apply the same hack in their board family (there
are three or four IIRC which might be concerned some day).
There are also those machines we do not even know about which might be
generating bogus DT using old U-Boot and fixup_mtdparts(), so, unless
there is some all-arch fixup implementation, we wouldn't be able to fix
them all on arch side. I think the all-arch fixup implementation would
be the driver one, i.e. this patch as it is (or maybe with some
improvement).
That would fix all cases and only have an impact on the affected boards.
Sadly, it only fixes the known cases, not the unknown cases like
downstream forks which never get any bootloader updates, and which
you can't find in upstream U-Boot, and which you therefore cannot easily
catch in the arch-side fixup.
[...]