Hi Bean,

"Bean Huo (beanhuo)" <beanhuo@xxxxxxxxxx> wrote on Fri, 7 Dec 2018
13:12:56 +0000:

> >+Bean,
> >
> >Hi Thomas,
> >
> >First of all, I'd like to thank you for sharing this patch. I'm pretty
> >sure this will save days of painful debug sessions to a lot of people.
> >
> >On Thu, 29 Nov 2018 22:12:50 +0100 (CET) Thomas Gleixner
> ><tglx@xxxxxxxxxxxxx> wrote:
> >
> >> On some Micron NAND chips block erase fails occasionally despite the
> >> chip claiming that it succeeded. The flash block seems to be not
> >> completely erased and subsequent usage of the block results in hard to
> >> decode and very subtle failures or corruption.
> >>
> >> The exact reason is unknown, but experimentation has shown that it is
> >> only happening when erasing an erase block which is partially written.
> >> Partially written erase blocks are not uncommon with UBI/UBIFS. Note,
> >> that this does not always happen. It's a rare and random, but eventually
> >> fatal failure.
> >>
> >> For now, just blindly write 6 pages to 0. Again experimentation has
> >> shown that it's not sufficient to write pages at the beginning of the
> >> erase block. There need to be pages written in the second half of the
> >> erase block as well. So write 3 pages before and past the middle of the
> >> block.
> >>
> >> Less than 6 pages might be sufficient, but it might even be necessary
> >> to write more pages to make sure that it's completely cured. Two pages
> >> still failed, but the 6 held up in a stress test scenario.
> >>
> >> This should be optimized by keeping track of writes, but that needs
> >> proper information about the issue.
> >>
> >> As it's just observation and experimentation based, it's probably wise
> >> to hold off on this until there is proper clarification about the root
> >> cause of the problem. The patch is for reference so others can avoid
> >> to decode this again, but there is no guarantee that it actually fixes
> >> the issue completely.
> >
> >I agree. I Cc-ed Bean from Micron. Maybe he can provide more information
> >on this issue.
> >
> >>
> >> Therefore:
> >>
> >> Not-yet-signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> >>
> >> Cc: Boris Brezillon <boris.brezillon@xxxxxxxxxxx>
> >> Cc: Miquel Raynal <miquel.raynal@xxxxxxxxxxx>
> >> Cc: Richard Weinberger <richard@xxxxxx>
> >>
> >> ---
> >>
> >> P.S.: This was debugged on an older kernel version (sigh) and ported
> >> forward without actual testing on mainline. My MTD foo is a bit
> >> rusty, so I won't be surprised if there are better ways to do that.
> >
> >Let's first wait for Bean's feedback before discussing implementation
> >details. BTW, do you remember the part number(s) of the flash(es)
> >impacted by this problem in your case?
> >
>
> Thanks, let me know this issue, I will look at this

I think it's time for you to comment on the situation.

Thanks,
Miquèl

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
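
For reference, the page selection described in the quoted patch description ("write 3 pages before and past the middle of the block") could be sketched roughly as below. This is only an illustration of the arithmetic, not the code from Thomas' patch: the helper name pre_erase_page_indices() and the PRE_ERASE_PAGES constant are made up for this example, and the actual patch may pick the pages differently.

```c
#include <stddef.h>

/* Number of pages written to 0 before the erase, per the quoted description. */
#define PRE_ERASE_PAGES 6

/*
 * Hypothetical helper (not from the patch): fill 'pages' with the indices,
 * relative to the start of one erase block, of the pages to overwrite.
 * Three pages sit just below the middle of the block and three start at
 * the middle. Returns the number of indices stored, or 0 if the block is
 * too small to hold three pages on each side of the middle.
 */
static size_t pre_erase_page_indices(unsigned int pages_per_block,
                                     unsigned int pages[PRE_ERASE_PAGES])
{
    unsigned int mid = pages_per_block / 2;
    unsigned int half = PRE_ERASE_PAGES / 2;
    size_t i;

    if (mid < half || pages_per_block - mid < half)
        return 0;

    for (i = 0; i < PRE_ERASE_PAGES; i++)
        pages[i] = mid - half + (unsigned int)i;

    return PRE_ERASE_PAGES;
}
```

With 64 pages per erase block this selects pages 29 to 34, so both halves of the block get written before the erase is issued.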