On Wed, Mar 27, 2024 at 04:31:29PM +0100, Miquel Raynal wrote:
> Hi Christian,
>
> ansuelsmth@xxxxxxxxx wrote on Wed, 27 Mar 2024 15:36:54 +0100:
>
> > On Wed, Mar 27, 2024 at 03:26:55PM +0100, Rafał Miłecki wrote:
> > > On 2024-03-22 05:09, Christian Marangi wrote:
> > > > MTD OTP logic is very fragile and can be problematic with some specific
> > > > kind of devices.
> > > >
> > > > NVMEM across the years had various iteration on how Cells could be
> > > > declared in DT and MTD OTP probably was left behind and
> > > > add_legacy_fixed_of_cells was enabled without thinking of the
> > > > consequences.
> > >
> > > Er... thank you?
> > >
> >
> > Probably made some bad assumption and sorry for it!
>
> Well, "not thinking about all consequences" seems always legitimate to
> me, we are not robots. Anyway, I agree we should drop this sentence.
>
> > > > That option enables NVMEM to scan the provided of_node and treat each
> > > > child as a NVMEM Cell, this was to support legacy NVMEM implementation
> > > > and don't cause regression.
> > > >
> > > > This is problematic if we have devices like Nand where the OTP is
> > > > triggered by setting a special mode in the flash. In this context real
> > > > partitions declared in the Nand node are registered as OTP Cells and
> > > > this causes the probe to fail with -EINVAL error.
> > > >
> > > > This was never noticed due to the fact that till now, no Nand supported
> > > > the OTP feature. With commit e87161321a40 ("mtd: rawnand: macronix: OTP
> > > > access for MX30LFxG18AC") this changed and coincidentally this Nand is
> > > > used on a FritzBox 7530 supported on OpenWrt.
> > >
> > > So as you noticed this problem was *exposed* by adding OTP support for
> > > Macronix NAND chips.
> > >
> > >
> > > > Alternative and more robust way to declare OTP Cells are already
> > > > possible by using the fixed-layout node or by declaring a child node
> > > > with the compatible set to "otp-user" or "otp-factory".
> > > >
> > > > To fix this and limit any regression with other MTD that makes use of
> > > > declaring OTP as direct child of the dev node, disable
> > > > add_legacy_fixed_of_cells if we detect the MTD type is Nand.
> > > >
> > > > With the following logic, the OTP NVMEM entry is correctly created with
> > > > no Cells and the MTD Nand is correctly probed and partitions are
> > > > correctly exposed.
> > > >
> > > > Fixes: 2cc3b37f5b6d ("nvmem: add explicit config option to read old syntax fixed OF cells")
> > >
> > > It's not that commit however that introduced the problem. Introducing
> > > "add_legacy_fixed_of_cells" just added a clean way of enabling parsing
> > > of old cells syntax. Even before my commit NVMEM subsystem was looking
> > > for NVMEM cells in NAND devices.
> > >
> > > I booted kernel 6.6 which has commit e87161321a40 ("mtd: rawnand:
> > > macronix: OTP access for MX30LFxG18AC") but does NOT have commit
> > > 2cc3b37f5b6d ("nvmem: add explicit config option to read old syntax
> > > fixed OF cells").
> > >
> > > Look at this log from Broadcom Northstar (Linux 6.6):
> > > [ 0.410107] nand: device found, Manufacturer ID: 0xc2, Chip ID: 0xdc
> > > [ 0.416531] nand: Macronix MX30LF4G18AC
> > > [ 0.420409] nand: 512 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > > [ 0.428022] iproc_nand 18028000.nand-controller: detected 512MiB total, 128KiB blocks, 2KiB pages, 16B OOB, 8-bit, BCH-8
> > > [ 0.438991] Scanning device for bad blocks
> > > [ 0.873598] Bad eraseblock 738 at 0x000005c40000
> > > [ 1.030279] random: crng init done
> > > [ 1.854895] Bad eraseblock 2414 at 0x000012dc0000
> > > [ 2.657354] Bad eraseblock 3783 at 0x00001d8e0000
> > > [ 2.662967] Bad eraseblock 3785 at 0x00001d920000
> > > [ 2.848418] nvmem user-otp1: nvmem: invalid reg on /nand-controller@18028000/nand@0
> > > [ 2.856126] iproc_nand 18028000.nand-controller: error -EINVAL: Failed to register OTP NVMEM device
> > >
> > > So to sum it up:
> > > 1. Problem exists since much earlier and wasn't introduced by 2cc3b37f5b6d
> > > 2. Commit 2cc3b37f5b6d just gives you a clean way of solving this issue
> > > 3. Problem was exposed by commit e87161321a40
> > > 4. We miss fix for v6.6 which doesn't have 2cc3b37f5b6d (it hit v6.7)
> > >
> >
> > So the thing was broken all along? Maybe the regression was introduced
> > when OF support for NVMEM cell was introduced? (and OF scan was enabled
> > by default?)
> >
> > Anyway Sorry for adding the wrong fixes, maybe Miquel can remove the
> > commit from mtd/fixes and fix the problematic fixes tag?
>
> Yes, please send a v4 (with the sentence above updated) and I will drop
> v3.
>

Thanks a lot! I asked Rafal for some suggestions for a better fixes tag
and I will send v4.

--
	Ansuel
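
For readers following the thread, here is a rough standalone sketch of the
idea in the quoted commit message: leave legacy child-node cell parsing on
for other MTD types, but turn it off when the device is a NAND. It is only an
illustration, not the patch itself. The mock_* types and helpers are
hypothetical stand-ins made up for this sketch; only the name
add_legacy_fixed_of_cells comes from the discussion above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
enum mock_mtd_type {
	MOCK_MTD_NORFLASH,
	MOCK_MTD_NANDFLASH,
};

struct mock_nvmem_config {
	const char *name;
	/* Same field name as in the thread; the struct itself is a mock. */
	bool add_legacy_fixed_of_cells;
};

/* Assumed helper: true when the (mock) MTD device is a NAND flash. */
static bool mock_mtd_type_is_nand(enum mock_mtd_type type)
{
	return type == MOCK_MTD_NANDFLASH;
}

/*
 * Build the config for an OTP NVMEM device. On NAND, direct children of
 * the node are real partitions, so they must not be parsed as legacy
 * cells; for every other type the old behaviour is kept.
 */
static struct mock_nvmem_config
mock_otp_nvmem_config(const char *name, enum mock_mtd_type type)
{
	struct mock_nvmem_config cfg = {
		.name = name,
		.add_legacy_fixed_of_cells = !mock_mtd_type_is_nand(type),
	};

	return cfg;
}
```

With this shape, a NAND "user-otp" device would be registered with no
legacy cells, while other MTD types keep the legacy parsing and therefore
should not regress.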