On Tue, Jun 8, 2021 at 12:23 AM Andrej Picej <andrej.picej@xxxxxxxxx> wrote: > On 7. 06. 21 22:03, Trent Piepho wrote: > > On Mon, Jun 7, 2021 at 2:32 AM Andrej Picej <andrej.picej@xxxxxxxxx> wrote: > >> - Samsung K9K8G08U0E (pagesize: 0x800, oobsize: 0x40) > >> - Winbond W29N08GVSIAA (pagesize: 0x800, oobsize: 0x40) and > >> - Spansion S34ML08G201FI00 (pagesize: 0x800, oobsize: 0x80). > >> > >> All NANDs having set ECC strength to 4 (13 bytes) despite Spansion NAND > >> chip supporting ECC strength of 9 (29 bytes). > tool uses NAND settings used by eboot, which are hardcoded to fixed > pagesize of 0x800 bytes and oobsize of 0x40 bytes (8 ECC bits). If for Ok, so 4 ecc bits was used for testing, but your actual use case is for flash that uses 8 bits when NAND has 128 OOB bytes, which the current code uses a value different than 8? My calculation is that 0x800+0x80 would use 18 bit ECC. But really, the exact numbers don't matter. Just that your nand flash tool, barebox xload, barebox main, uboot, uboot spl, linux, kobs-ng, etc. don't all agree on ECC values. > I agree that it would be better to use all of the space available, but > if flasher used wrong settings to copy barebox binary to NAND these > settings (although not optimal) should be used to make booting even > possible. But, how does one know 2nd stage barebox is flashed with the same ECC as 1st stage xload? See below. > > The main reason why I think we should use FCB here for this is because > i.MX6's ROM already uses these values for booting into pre-bootloader. > That's why we try to act in xloader like ROM does (reading NAND > parameters from FCB). Nevertheless flasher tools should be responsible > to match the BCH ECC page with what it is written into FCB. If that is I think it's fair to assume that the barebox xload is using the ECC from the FCB, otherwise it would not boot. But does barebox 2nd stage use same ECC as xload? In your case, the answer is currently yes. But is this always the case? I don't know of a specific board where it is not, but I do know this: It is common that a Linux based software update system will not update the bootloader. It might just do rootfs, or rootfs+kernel, but bootloader is less common. In two a stage system, xload + main, maybe the xload is not updated. It is a pain from Linux, with different versions of kobs and/or kobs-ng, which are poorly maintained and documented, a special attribute in sysfs that old Freescale kernels had and that isn't around anymore that is sometimes needed and sometimes not, etc. And as I have just discovered, iMX6UL and iMX6ULL use a different encoding of FCB that all other iMX and of course some kobs-ng versions don't know this and create a broken FCB. I even made a system that did this: barebox-xload had A/B support for 2nd stage and 2nd stage was updated, but the xload wasn't, since it wasn't fail-safe. But this was for CycloneV and doesn't apply here. So, suppose we have updated barebox 2nd stage from Linux (or barebox)? Now it uses "common" ECC values (IMHO, "optimal" is not an accurate term here) from Linux kernel. Barebox-xload current works to boot this, but your change will break that. It is a difficult problem, either choice of a ECC values could be the correct one. > In our case the described proprietary flasher tool only flashes barebox > so only NAND pages with barebox binary are using not optimal ECC > settings. If for example kernel, devicetree and rootfs would be flashed > from barebox the NAND pages there would use correct ECC size and booting > into linux and updating those NAND pages from linux works. Updating > barebox from barebox itself (using barebox_update) would mean that the > barebox binary will be overwritten in NAND with optimal ECC settings and > FCB will be updated accordingly. Does barebox_update run in 2nd stage barebox update both 2nd stage barebox and barebox-xload + FCB? Consider what happens if barebox 2nd stage is updated from Linux. Usually software update systems run on Linux, e.g. rauc or mender. In this case it will use Linux ECC settings, not FCB settings. You've got boards with barebox-xload and barebox using different ECC settings than kernel and rootfs. And not just two different settings, but also 2nd stage barebox and Linux don't know this. I predict this will be a source of much future pain. > We are only using this ECC values to read barebox binary from NAND and > copy it to RAM. If other NAND pages will be using different ECC values > that doesn't break anything, I think. Only problem that I can see here > is barebox or linux reading NAND pages occupied by barebox binary, this > will most likely fail, but I don't see why that would be necessary anyway. > > I don't think we are braking anything here, we are just fixing booting > barebox from NAND whit not optimal ECC settings. > > Please correct me if I'm wrong or if I'm missing something here? You've got ECC settings for: (xload barebox) (kernel rootfs) But if someone had this: (xload) (barebox kernel rootfs) Then it breaks. Why would they have that? As I describe above, everything in the 2nd set is updated from Linux using some software update system. Of course, the most common way is this: (xload barebox kernel rootfs) With just one set, when the xload has two choices, FCB vs common values, both are the same, so even if barebox is updated from Linux it still works. A solution that works for boths cases, but is also ugly and difficult, is to try both. If xload sees FCB values != calculated values, then just try both settings. One is virtually assured that the incorrect settings will produce massive numbers of errors from BCH. Read a couple pages and the settings which result in uncorrectable ECC errors on all pages are the wrong ones. _______________________________________________ barebox mailing list barebox@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/barebox