Hi Han, Han Xu <han.xu@xxxxxxx> wrote on Wed, 26 May 2021 09:17:00 -0500: > On 21/05/26 09:41AM, Miquel Raynal wrote: > > Hi Han, > > > > Han Xu <han.xu@xxxxxxx> wrote on Tue, 25 May 2021 14:13:08 -0500: > > > > > On 21/05/23 07:44PM, Sean Nyekjaer wrote: > > > > On 22/05/2021 22.51, Han Xu wrote: > > > > > The code change updates the gpmi driver bch geometry settings, the NAND > > > > > chips required minimum ecc strength and step size will be the default > > > > > value. It also proposes a new way to set bch geometry for large oob NAND > > > > > and provides an option to keep the legacy bch geometry setting for > > > > > backward compatibility. > > > > > > > > This will break all existing devicetree's. (this happened to us with the same style already merged u-boot patch) > > > > > > > > > > > > > > - For the legacy bch geometry settings > > > > > The driver uses legacy bch geometry settings if explicitly enabled it in > > > > > DT with fsl, legacy-bch-geometry flag, or the NAND chips are too old > > > > > that do NOT provide any required ecc info. > > > > > > > > NAND's are providing the minimum required ecc, the now legacy method > > > > actually gives more ecc bits and quite cheap when using hw ecc. > > > > > > > > > > > > > > The legacy_set_geometry() sets the data chunk size(step_size) larger > > > > > than oob size to make sure BBM locates in data chunk, then set the > > > > > maximum ecc stength oob can hold. It always use unbalanced ECC layout, > > > > > which ecc0 will cover both meta and data0 chunk. > > > > > > > > > > This algorithm can NOT provide strong enough ecc for some NAND chips, > > > > > such as some 8K+744 MLC NAND. We still leave it here just for backward > > > > > compatibility and als for some quite old version NAND chips support. It > > > > > should be en/disabled in both u-boot and kernel at the same time. > > > > > > > > > > - For the large oob bch geometry settings > > > > > This type of setting will be used for NAND chips, which oob size is > > > > > larger than 1024B OR NAND required step size is small than oob size, > > > > > the general idea is, > > > > > > > > > > 1.Try all ECC strength from the minimum value required by NAND chip > > > > > to the maximum one that works, any ECC makes the BBM locate in > > > > > data chunk can be eligible. > > > > > > > > > > 2.If none of them works, using separate ECC for meta, which will add > > > > > one extra ecc with the same ECC strength as other data chunks. > > > > > This extra ECC can guarantee BBM located in data chunk, also we > > > > > need to check if oob can afford it. > > > > > > > > > > - For all other common cases > > > > > set the bch geometry by chip required strength and step size, which uses > > > > > the minimum ecc strength chip required. > > > > > > > > > > Signed-off-by: Han Xu <han.xu@xxxxxxx> > > > > > > > > One further point, u-boot older than v2020.04 will not be aligned with the way ecc bits is > > > > calculated with this patch applied(without the legacy option set). > > > > > > > > It's quite a mess :/ > > > > I would recommend to use the legacy mode as default, and add the new style as "modern" option. > > > > (Also in u-boot) > > > > > > > > /Sean > > > > > > > > > Hi Sean, > > > > > > I know this change brings mess but the legacy way is wrong. The way it determine > > > the data chunk size is arbitrary, > > > > Yes, that's always the case: all default choices are arbitrary in the > > Linux kernel, there is actually a lot of logic provided by the core to > > handle that, unless the user requires something specific. > > > > > take any 8K + 448 (not 744, typo in previous > > > comments) Micron NAND chips as example, the legacy way can only provide 16bit > > > ecc since data chunk size is set to 512B, but 24b/1K is required by NAND chips. > > > > This means a strength of 32 bits per kilobyte of data vs 24 bits per > > kilobyte. > > > > > According to the A/B X/Y ecc requirement, this is not acceptable. > > > > What you call the legacy way is even better, the only situation where > > it may be an issue is if the NAND chip does not feature enough OOB > > bytes to store the ECC codes, which does not seem to be your primary > > concern here. > > Hi Miquel, > > The legacy ecc strength is fine or even better by average, but it doesn't meet > the requirement #2 > > (1) A / B >= X / Y > (2) A >= X > > that's my primary concern. I understand that (2) might be ideal to meet but is breaking all the boards that use this driver really worth the trouble? Short answer: no. So we need to adapt the calculation for new boards/new flash chips/certain geometries at most. > > > The new implementation might get weak ecc than legacy way in some cases but it > > > is safety guaranteed. > > > > What does "safety guaranteed" means? > > set minimum ecc required by nand chip at least meet all requirements > > > > > > This reminds me the gpmi raw access mode changes in kernel 3.19, it also changes > > > the driver behaviors and makes totally different output compared with older > > > versions. I know changes bring mess but we have to accept it at some point > > > rather than keep compromising to the wrong way. > > > > How is this an argument? I am usually in favor of moving forward when > > there is a real justification, but this does not seem the case, unless > > I am understanding it all the wrong way. > > > > > The change has been in NXP kernel fork for a while, so quite a few customers are > > > using this bch geometry settings. I hope it can be upstreamed, any other things > > > I can do may mitigate the imapact? > > > > You are well aware of the upstreaming process, trying to merge > > something locally, making it used and then complaining because not > > upstreaming it would break your customers really is your own > > responsibility. > > Sorry I understand I should try upstreaming it early, so I am still looking for > a chance to avoid further divergence. > > > > > IMHO the solutions are: > > - the current (mainline) default will remain the standard for > > geometries which are already widely supported > > - if there are new geometries that must be supported and do not fit > > because of the "legacy" logic, then you may detect that and try > > to fallback to the "modern" way of calculating the ECC > > parameters (or even jump directly to the modern way if the geometry > > really is not currently supported officially) > > - if your customers want a specific chunk size/strength when > > rebasing on top of a mainline kernel there are DT properties which do > > that anyway > > - follow Sean advice: introduce a property requesting to use the > > 'modern' or 'legacy' logic (with a better name than modern) but first > > check with Rob that this if valid. Another hint: please check the core helpers and use them instead of trying to re-invent the wheel: normally just describing the engine capabilities and calling a single helper should do the trick. But this 'new' calculation should only apply to eg. MLC devices or devices with specific geometries, not to all devices. Thanks, Miquèl