>+Bean,
>
>Hi Thomas,
>
>First of all, I'd like to thank you for sharing this patch. I'm pretty sure
>this will save a lot of people days of painful debug sessions.
>
>On Thu, 29 Nov 2018 22:12:50 +0100 (CET) Thomas Gleixner
><tglx@xxxxxxxxxxxxx> wrote:
>
>> On some Micron NAND chips block erase fails occasionally despite the
>> chip claiming that it succeeded. The flash block seems to be not
>> completely erased and subsequent usage of the block results in hard to
>> decode and very subtle failures or corruption.
>>
>> The exact reason is unknown, but experimentation has shown that it is
>> only happening when erasing an erase block which is partially written.
>> Partially written erase blocks are not uncommon with UBI/UBIFS. Note
>> that this does not always happen. It's a rare and random, but eventually
>> fatal failure.
>>
>> For now, just blindly write 6 pages to 0. Again experimentation has
>> shown that it's not sufficient to write pages at the beginning of the
>> erase block. There need to be pages written in the second half of the
>> erase block as well. So write 3 pages before and past the middle of the
>> block.
>>
>> Less than 6 pages might be sufficient, but it might even be necessary
>> to write more pages to make sure that it's completely cured. Two pages
>> still failed, but the 6 held up in a stress test scenario.
>>
>> This should be optimized by keeping track of writes, but that needs
>> proper information about the issue.
>>
>> As it's just observation and experimentation based, it's probably wise
>> to hold off on this until there is proper clarification about the root
>> cause of the problem. The patch is for reference so others can avoid
>> having to decode this again, but there is no guarantee that it actually
>> fixes the issue completely.
>
>I agree. I Cc-ed Bean from Micron. Maybe he can provide more information
>on this issue.
>
>>
>> Therefore:
>>
>> Not-yet-signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>>
>> Cc: Boris Brezillon <boris.brezillon@xxxxxxxxxxx>
>> Cc: Miquel Raynal <miquel.raynal@xxxxxxxxxxx>
>> Cc: Richard Weinberger <richard@xxxxxx>
>>
>> ---
>>
>> P.S.: This was debugged on an older kernel version (sigh) and ported
>> forward without actual testing on mainline. My MTD foo is a bit
>> rusty, so I won't be surprised if there are better ways to do that.
>
>Let's first wait for Bean's feedback before discussing implementation details.
>BTW, do you remember the part number(s) of the flash(es) impacted by this
>problem in your case?
>

Thanks for letting me know about this issue. I will look into it.

>Thanks,
>
>Boris
>

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
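
For reference, the page selection described in the quoted patch description
(three pages on each side of the middle of the erase block, six in total) can
be sketched roughly as below. This is only an illustration and not the code
from the patch: the function name pick_pad_pages, the constants, and the choice
of the pages directly adjacent to the midpoint are all assumptions, since the
description does not say exactly which pages were written.

```c
#include <stdint.h>

/* Hypothetical constants: 3 pages before and 3 pages past the midpoint. */
#define PAD_PAGES_PER_HALF 3
#define PAD_PAGES_TOTAL    (2 * PAD_PAGES_PER_HALF)

/*
 * Fill 'out' with the indices (relative to the start of the erase block)
 * of the pages to overwrite with zeros before issuing the erase.
 *
 * erasesize: erase block size in bytes
 * writesize: page size in bytes
 *
 * Returns 0 on success, -1 if the geometry is invalid or the block is
 * too small to hold six distinct pages.
 */
int pick_pad_pages(uint32_t erasesize, uint32_t writesize,
                   uint32_t out[PAD_PAGES_TOTAL])
{
    uint32_t pages_per_block, mid, i;

    if (writesize == 0 || erasesize % writesize != 0)
        return -1;

    pages_per_block = erasesize / writesize;
    if (pages_per_block < PAD_PAGES_TOTAL)
        return -1;

    /* First page of the second half of the block. */
    mid = pages_per_block / 2;

    /* Assumption: pages mid-3 .. mid+2, i.e. contiguous around the middle. */
    for (i = 0; i < PAD_PAGES_TOTAL; i++)
        out[i] = mid - PAD_PAGES_PER_HALF + i;

    return 0;
}
```

With a 128 KiB erase block and 2 KiB pages (64 pages per block), this selects
pages 29 to 34. The byte offset of each page within the block is the page
index multiplied by the page size.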