Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page

On 8. 06. 21 14:38, Trent Piepho wrote:
On Tue, Jun 8, 2021 at 12:23 AM Andrej Picej <andrej.picej@xxxxxxxxx> wrote:
On 7. 06. 21 22:03, Trent Piepho wrote:
On Mon, Jun 7, 2021 at 2:32 AM Andrej Picej <andrej.picej@xxxxxxxxx> wrote:
- Samsung K9K8G08U0E (pagesize: 0x800, oobsize: 0x40)
- Winbond W29N08GVSIAA (pagesize: 0x800, oobsize: 0x40) and
- Spansion S34ML08G201FI00 (pagesize: 0x800, oobsize: 0x80).

All NANDs had ECC strength set to 4 (13 bytes), despite the Spansion
NAND chip supporting an ECC strength of 9 (29 bytes).

tool uses NAND settings used by eboot, which are hardcoded to fixed
pagesize of 0x800 bytes and oobsize of 0x40 bytes (8 ECC bits). If for

OK, so 4 ECC bits were used for testing, but your actual use case is
flash written with 8 bits even though the NAND has 128 OOB bytes, for
which the current code uses a value different from 8?  My calculation
is that 0x800+0x80 would use 18-bit ECC.

Actually, 8 ECC bits were used for testing. Maybe it was wrong of me to call EccBlockNEccType (from the i.MX 6Dual/6Quad Applications Processor Reference Manual) the ECC strength in the commit message, since it gets shifted left by one bit to get the ECC size in bits. So yes, we agree: 8-bit ECC for 0x800+0x40 (4<<1 = 8) and 18-bit ECC for 0x800+0x80 (9<<1 = 18).
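
To make the arithmetic above concrete, here is a minimal Python sketch. The shift by one bit and the 13/29 byte figures come from this thread. The "calculated" strength formula, the 10-byte metadata size, the 512-byte chunk size and the 13-bit Galois field length are assumptions modelled on the legacy geometry calculation of the Linux gpmi-nand driver, and all function names are made up for illustration.

```python
GF_LEN = 13          # assumed Galois field length for 512-byte ECC chunks
CHUNK_SIZE = 512     # assumed ECC chunk size in bytes
METADATA_SIZE = 10   # assumed OOB bytes reserved for metadata


def ecc_bits_from_fcb_field(ecc_type):
    """EccBlockNEccType is stored halved; shift left by one to get ECC bits."""
    return ecc_type << 1


def ecc_bytes_per_chunk(ecc_bits):
    """Parity size per chunk in whole bytes (fraction truncated, as in the thread)."""
    return (ecc_bits * GF_LEN) // 8


def calculated_ecc_bits(pagesize, oobsize):
    """Strongest even ECC strength that fits into the OOB area (assumed formula)."""
    chunks = pagesize // CHUNK_SIZE
    strength = ((oobsize - METADATA_SIZE) * 8) // (GF_LEN * chunks)
    return strength & ~1  # round down to an even number of bits
```

Under these assumptions a 0x800+0x40 page yields 8 bits and a 0x800+0x80 page yields 18 bits, which matches the numbers quoted above.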


But really, the exact numbers don't matter.  Just that your nand flash
tool, barebox xload, barebox main, uboot, uboot spl, linux, kobs-ng,
etc. don't all agree on ECC values.

I agree that it would be better to use all of the space available, but
if the flasher used the wrong settings to copy the barebox binary to
NAND, these settings (although not optimal) should be used to make
booting possible at all.

But, how does one know 2nd stage barebox is flashed with the same ECC
as 1st stage xload?  See below.


The main reason I think we should use the FCB here is that the
i.MX6 ROM already uses these values for booting into the pre-bootloader.
That's why we try to act in the xloader like the ROM does (reading NAND
parameters from the FCB). Nevertheless, flasher tools should be responsible
for matching the BCH ECC page with what is written into the FCB. If that is

I think it's fair to assume that the barebox xload is using the ECC
from the FCB, otherwise it would not boot.  But does barebox 2nd stage
use same ECC as xload?  In your case, the answer is currently yes.
But is this always the case?

I don't know of a specific board where it is not, but I do know this:
It is common that a Linux based software update system will not update
the bootloader.  It might just do rootfs, or rootfs+kernel, but
bootloader is less common.  In a two-stage system, xload + main, maybe
the xload is not updated.  It is a pain from Linux, with different
versions of kobs and/or kobs-ng, which are poorly maintained and
documented, a special attribute in sysfs that old Freescale kernels
had and that isn't around anymore that is sometimes needed and
sometimes not, etc.  And as I have just discovered, iMX6UL and iMX6ULL
use a different encoding of the FCB than all other iMX, and of course
some kobs-ng versions don't know this and create a broken FCB.

I even made a system that did this: barebox-xload had A/B support for
2nd stage and 2nd stage was updated, but the xload wasn't, since it
wasn't fail-safe.  But this was for CycloneV and doesn't apply here.

So, suppose we have updated barebox 2nd stage from Linux (or barebox)?
  Now it uses "common" ECC values (IMHO, "optimal" is not an accurate
term here) from the Linux kernel.  Barebox-xload currently works to boot
this, but your change will break that.

OK, I see. This is a valid point. I didn't realize that updating only the 2nd stage barebox is common practice. Do you know of any imx6 board that does that? I ask because this xloader is imx6 specific.


It is a difficult problem: either choice of ECC values could be the
correct one.

Yes, I agree, either way we break booting in one of our use cases. In my case the pre-bootloader wouldn't be read correctly, and in your case the main bootloader wouldn't be read correctly.


In our case the described proprietary flasher tool only flashes barebox,
so only the NAND pages holding the barebox binary use non-optimal ECC
settings. If, for example, kernel, devicetree and rootfs were flashed
from barebox, the NAND pages there would use the correct ECC size, and
booting into linux and updating those NAND pages from linux works.
Updating barebox from barebox itself (using barebox_update) would mean
that the barebox binary is overwritten in NAND with optimal ECC settings
and the FCB is updated accordingly.

Does barebox_update, run from 2nd stage barebox, update both 2nd stage
barebox and barebox-xload + FCB?

Yes, it does.


Consider what happens if barebox 2nd stage is updated from Linux.
Usually software update systems run on Linux, e.g. rauc or mender.  In
this case it will use Linux ECC settings, not FCB settings.

You've got boards with barebox-xload and barebox using different ECC
settings than kernel and rootfs.  And not just two different settings,
but also 2nd stage barebox and Linux don't know this.  I predict this
will be a source of much future pain.


We are only using these ECC values to read the barebox binary from NAND
and copy it to RAM. If other NAND pages use different ECC values, that
doesn't break anything, I think. The only problem I can see here is
barebox or linux reading the NAND pages occupied by the barebox binary;
this will most likely fail, but I don't see why that would be necessary
anyway.

I don't think we are breaking anything here, we are just fixing booting
barebox from NAND with non-optimal ECC settings.

Please correct me if I'm wrong or if I'm missing something here?

You've got ECC settings for:
(xload barebox) (kernel rootfs)
But if someone had this:
(xload) (barebox kernel rootfs)
Then it breaks.

Yes, I agree. As I already wrote above, I didn't know this is a common way of doing a bootloader update.


Why would they have that?  As I described above, everything in the 2nd
set is updated from Linux using some software update system.

Of course, the most common way is this:
(xload barebox kernel rootfs)

With just one set, when the xload has two choices, FCB vs common
values, both are the same, so even if barebox is updated from Linux it
still works.

A solution that works for both cases, but is also ugly and difficult,
is to try both.  If xload sees FCB values != calculated values, then
just try both settings.  One is virtually assured that the incorrect
settings will produce massive numbers of errors from BCH.  Read a
couple pages and the settings which result in uncorrectable ECC errors
on all pages are the wrong ones.
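
The "try both" idea above could be sketched roughly as follows, in Python for brevity. This is only an illustration under stated assumptions, not barebox code: `read_page` is a hypothetical callback that reads one page with a given ECC strength and reports whether BCH could correct it, and `pick_ecc_bits`, `fake_flash` and the choice of probe pages are made up for the example.

```python
def pick_ecc_bits(read_page, fcb_bits, calculated_bits, probe_pages=(0, 1, 2, 3)):
    """Choose between the FCB and the calculated ECC strength by probing.

    read_page(page, bits) is a hypothetical callback returning True when the
    page could be corrected with that strength and False when it was
    uncorrectable.
    """
    if fcb_bits == calculated_bits:
        return fcb_bits  # nothing to decide, skip probing

    for bits in (fcb_bits, calculated_bits):
        failures = sum(1 for page in probe_pages if not read_page(page, bits))
        # A setting that fails on *all* probed pages is taken to be wrong.
        if failures < len(probe_pages):
            return bits
    return None  # neither setting produced a readable page


def fake_flash(written_bits):
    """Demo stand-in for NAND: pages decode only with the strength used to write."""
    return lambda page, bits: bits == written_bits
```

For example, `pick_ecc_bits(fake_flash(18), 8, 18)` selects the calculated value, while `pick_ecc_bits(fake_flash(8), 8, 18)` keeps the FCB value. The FCB value is tried first here only because the ROM has already booted with it; that ordering is a design choice of the sketch.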


Yes that would be an ugly fix for this.

But I see one problem. If different ECC values are used for the pre-bootloader and the main bootloader (as is the case in the example you provided), we would have to read the pre-bootloader and the main bootloader with different ECC settings.

So the xload would look something like:
- read a couple of pages from pre-bootloader and select appropriate "readtotal_pbl"
- copy pre-bootloader to RAM with selected "readtotal_pbl"
- read a couple of pages from main-bootloader and select appropriate "readtotal_main"
- copy the remaining pages (main barebox) to RAM with selected "readtotal_main"
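
The steps above can be sketched as follows (Python for brevity, not barebox code). The names `readtotal_pbl` and `readtotal_main` are taken from the list; everything else is an assumption for illustration: `read_page` is a hypothetical callback returning the page data or `None` on an uncorrectable error, the page lists stand in for the PBL/main boundary that would still have to be determined, and `candidates` holds the settings to try.

```python
def select_setting(read_page, pages, candidates):
    """Return the first candidate setting under which every probed page decodes."""
    for setting in candidates:
        if all(read_page(page, setting) is not None for page in pages):
            return setting
    raise IOError("no candidate ECC setting decodes the probed pages")


def copy_region(read_page, pages, setting, ram):
    """Append all pages of one region to the RAM buffer using one setting."""
    for page in pages:
        data = read_page(page, setting)
        if data is None:
            raise IOError("uncorrectable ECC error on page %d" % page)
        ram += data


def load_barebox(read_page, pbl_pages, main_pages, candidates, probe_count=2):
    """Probe and copy the pre-bootloader, then probe and copy main barebox."""
    ram = bytearray()

    readtotal_pbl = select_setting(read_page, pbl_pages[:probe_count], candidates)
    copy_region(read_page, pbl_pages, readtotal_pbl, ram)

    readtotal_main = select_setting(read_page, main_pages[:probe_count], candidates)
    copy_region(read_page, main_pages, readtotal_main, ram)

    return bytes(ram)
```

The sketch treats the two regions independently, so it covers both the layout where only the pre-bootloader uses the flasher's settings and the one where pre-bootloader and main barebox share them.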

Now for this we would need to find out where PBL ends and main barebox starts (probably from boot data?).

This would solve all of the problems right?

But is all this needed for such an extreme use case?

As I said, I don't know how common it is for users to update only the 2nd stage barebox, or how common it is to use flasher tools that use different ECC settings than, for example, barebox and the kernel. Both of these are needed to get an ECC mismatch, and I can't think of other cases where a mismatch in ECC settings between the pre-bootloader and 2nd stage barebox would happen.

BR,
Andrej



_______________________________________________
barebox mailing list
barebox@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/barebox


