Re: [PATCH 2/2] ARM: i.MX: xload: consider ECC strength when reading page

Hi Trent,

First, thanks for your input. Please find my comments below.

On 7. 06. 21 22:03, Trent Piepho wrote:
On Mon, Jun 7, 2021 at 2:32 AM Andrej Picej <andrej.picej@xxxxxxxxx> wrote:
Some NAND update tools/flashers do not take the full advantage of NAND's
entire page area for ECC purposes. For example, they might only use 2112
bytes of available 2176 bytes. In this case, ECC parameters have to be
read from the FCB table and taken into account in GPMI NAND xloader to
properly calculate page data length so DMA chain can be executed
correctly.

Tested on PHYTEC phyCARD i.MX6Q board with following NANDs:
- Samsung K9K8G08U0E (pagesize: 0x800, oobsize: 0x40)
- Winbond W29N08GVSIAA (pagesize: 0x800, oobsize: 0x40) and
- Spansion S34ML08G201FI00 (pagesize: 0x800, oobsize: 0x80).

All NANDs having set ECC strength to 4 (13 bytes) despite Spansion NAND
chip supporting ECC strength of 9 (29 bytes).

There is a bug in NXP's latest imx kernel, lf-5.10.y-1.0.0, that
results in the kernel driver incorrectly using the minimum ECC
specified in the ONFI nand specs instead of calculating a maximal ecc
value and using that, which is what prior kernels and the upstream
kernel use.  It was caused by incorrectly resolving a conflict when
they rebased one of their old patches to 5.10.

The common pagesize 0x800, oobsize 0x40 should use 8-bit ECC.  That's
what the uboot, barebox, and linux drivers would do since the first
mxs nand support years ago.  It's only the recent kernel bug in nxp's
kernel that will choose 4.

OK, I wasn't aware of this kernel bug, but that is not what we are trying to fix here. Our use case is a migration from eboot (some old WinCE version) to barebox using a proprietary flasher tool. This tool uses the NAND settings from eboot, which are hardcoded to a page size of 0x800 bytes and an OOB size of 0x40 bytes (8 ECC bits). If a NAND with a different geometry is used (e.g. page size 0x800 bytes with OOB size 0x80 bytes), the BCH ECC page layout will still only use 0x840 bytes.
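
To make the numbers concrete, here is a rough sketch of how the size of the BCH page layout follows from the ECC strength. It assumes the usual GPMI/BCH geometry (512-byte data chunks, 13 parity bits per corrected bit, 10 metadata bytes); those constants and the helper name bch_page_bytes() are my assumptions for illustration, not code from the patch.

```c
#include <assert.h>

/* Assumed GPMI/BCH geometry; not taken from the patch itself. */
#define GF_LEN		13	/* parity bits per correctable bit (512-byte chunks) */
#define CHUNK_SIZE	512	/* data bytes per ECC chunk */
#define METADATA_SIZE	10	/* metadata bytes at the start of the page */

/*
 * Hypothetical helper: number of raw page bytes occupied by the BCH
 * layout (metadata + data chunks + parity) for a given ECC strength.
 */
static unsigned int bch_page_bytes(unsigned int pagesize,
				   unsigned int ecc_strength)
{
	unsigned int nchunks = pagesize / CHUNK_SIZE;
	unsigned int bits;

	assert(pagesize % CHUNK_SIZE == 0);

	bits = METADATA_SIZE * 8 +
	       nchunks * (CHUNK_SIZE * 8 + GF_LEN * ecc_strength);

	/* Round up to whole bytes. */
	return (bits + 7) / 8;
}
```

Under these assumptions, bch_page_bytes(0x800, 8) gives 2110, which fits into the 0x840 (2112) bytes the flasher uses, while bch_page_bytes(0x800, 18) gives 2175, which needs the full 0x880 (2176) bytes of a NAND with 0x80 bytes of OOB.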


So rather than switch to 4-bit, it would be better to fix these boards
to use 8-bit like they should.  More reliable ECC, and it will work
correctly on barebox, u-boot, old imx kernels, current upstream
kernels, and hopefully future imx kernels.

I agree that it would be better to use all of the available space, but if the flasher used the wrong settings to copy the barebox binary to NAND, those settings (although not optimal) have to be used to make booting possible at all.


Using the FCB data here might not be such a good idea.  While it seems
like the right thing, there are some issues:
- The barebox main gpmi nand driver doesn't use the FCB
- U-boot doesn't use the FCB
- No Linux kernel uses the FCB

The main reason why I think we should use the FCB here is that the i.MX6 ROM already uses these values to boot into the pre-bootloader. That's why the xloader tries to act like the ROM does (reading the NAND parameters from the FCB). Nevertheless, flasher tools should be responsible for making the BCH ECC page layout match what is written into the FCB. If that is not the case, we can only presume that the flasher used the optimal ECC size.
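
As a rough sketch of that selection logic: take the strength from the FCB when one is available, otherwise fall back to the largest strength that fits into the OOB area. The function names and the fcb_valid/fcb_ecc_type parameters are hypothetical, and the assumption that the FCB stores the ECC type as strength / 2 (so a value of 4 means 8-bit ECC) is mine; the constants are the usual GPMI/BCH ones rather than anything taken from the patch.

```c
#include <assert.h>

/* Assumed GPMI/BCH geometry; not taken from the patch itself. */
#define GF_LEN		13	/* parity bits per correctable bit (512-byte chunks) */
#define CHUNK_SIZE	512	/* data bytes per ECC chunk */
#define METADATA_SIZE	10	/* metadata bytes at the start of the page */

/* Largest even ECC strength whose parity fits into the OOB area. */
static unsigned int max_ecc_strength(unsigned int pagesize,
				     unsigned int oobsize)
{
	unsigned int nchunks = pagesize / CHUNK_SIZE;
	unsigned int strength;

	assert(nchunks > 0 && oobsize > METADATA_SIZE);

	strength = ((oobsize - METADATA_SIZE) * 8) / (GF_LEN * nchunks);

	/* Round down to an even value. */
	return strength & ~1u;
}

/*
 * Hypothetical selection: fcb_ecc_type is the per-chunk ECC type read
 * from the FCB, assumed here to encode strength / 2.
 */
static unsigned int pick_ecc_strength(int fcb_valid,
				      unsigned int fcb_ecc_type,
				      unsigned int pagesize,
				      unsigned int oobsize)
{
	if (fcb_valid)
		return fcb_ecc_type * 2;

	return max_ecc_strength(pagesize, oobsize);
}
```

With these assumptions, a 0x800/0x40 NAND falls back to strength 8 and a 0x800/0x80 NAND to strength 18, whereas an FCB written by the flasher with ECC type 4 selects strength 8 regardless of the OOB size.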


If you try to read/write nand from any of those places, it won't work.
The only way to make it work, is to have the FCB match what those
drivers do.

In our case the proprietary flasher tool only flashes barebox, so only the NAND pages holding the barebox binary use non-optimal ECC settings. If, for example, the kernel, devicetree and rootfs are flashed from barebox, those NAND pages use the correct ECC size, and booting into Linux and updating those pages from Linux works. Updating barebox from barebox itself (using barebox_update) means that the barebox binary is rewritten in NAND with optimal ECC settings and the FCB is updated accordingly.


I think it would have been better if the original design had been for
the bootloader to read the FCB, use that to load the kernel, and then
fixup the ECC config into the device tree for the kernel to use too.
One source, the FCB, which is propagated to all users.  Everyone will
agree on the ECC and there are no independent settings to keep in
sync.

But they didn't do that.  Each driver figures it out on its own and
hopefully they use matching algorithms that arrive at the same answer.
But of course this fails, like with nxp's lf-5.10.y-1.0.0 kernel.
This isn't the first time, this same type of bug appeared back in 2013
in 2febcdf84b and was fixed in 031e2777e.

So while your commit will allow these boards using poorly chosen FCB
values to work with the xloader, they will be corrupted if nand is
written to from barebox non-xload or from linux.


We only use these ECC values to read the barebox binary from NAND and copy it to RAM. If other NAND pages use different ECC values, that doesn't break anything, I think. The only problem I can see is barebox or Linux reading the NAND pages occupied by the barebox binary. That will most likely fail, but I don't see why it would be necessary anyway.

I don't think we are breaking anything here; we are just fixing booting barebox from NAND with non-optimal ECC settings.

Please correct me if I'm wrong or if I'm missing something here.

BR,

Andrej
