On Wed, Oct 2, 2019 at 3:35 AM JH <jupiter.hce@xxxxxxxxx> wrote:
>
> Hi,
>
> My understanding is that MTD manages the NAND bad blocks, but can the
> MTD prevent bad blocks happening?
>

In short, no.

> My iMX6 NAND device was only up and running about a month, it now
> failed to boot from NAND due to the bad blocks:
>
> Questions:
>
> (a) what could be common cause to trigger bad blocks?

NAND flash gets bit errors. It happens and there's no way to prevent
it; you can only manage it. However, it's important to note that a
bit error != bad block.

> (b) if I reflush the NAND will the bad blocks recovered or just mapped
> it to bad block list?

I assume by "reflush" you mean "reflash"? Not necessarily. You don't
know what the problem is, therefore you don't know what will help.

>
> .......
> Bad block table found at page 131008, version 0x01
> Bad block table found at page 130944, version 0x01

For a system running with a BBT (bad block table), this is a normal
and good message. It doesn't indicate a problem - it is just telling
you where the driver is keeping the BBT.

> ................
> [FAILED] Failed to mount Kernel Debug File System.
> [FAILED] Failed to mount Temporary Directory (/tmp).
> [FAILED] Failed to start Remount Root and Kernel File Systems.
> [FAILED] Failed to mount /var/volatile.
> [FAILED] Failed to mount FUSE Control File System.
> ..............

These messages don't indicate anything useful at all. You assert that
you have developed "bad blocks", but the above messages can't validate
that, so it remains an assumption. In fact, I don't believe it has
anything to do with your flash at all, considering that most of those
aren't physical file systems, but virtual ones that don't use the
flash.

In short - you may or may not have flash corruption issues, but the
above messages don't tell us anything at all one way or the other.

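To make the bit-error versus bad-block distinction concrete, here is a
minimal sketch of how the two could be told apart on a running system.
It assumes a Linux kernel that exposes the MTD sysfs attributes
(corrected_bits, ecc_failures, bad_blocks and friends) under
/sys/class/mtd/mtdN; not every driver provides all of them. The helper
names read_mtd_counters and summarize are made up for illustration,
and the notes it prints are a rough interpretation, not a diagnosis.

```python
from pathlib import Path

# Attribute names taken from the kernel's sysfs-class-mtd ABI. Not every
# driver or kernel version exposes all of them, so missing files are skipped.
COUNTERS = (
    "ecc_strength",       # bits the ECC can correct per ECC step
    "bitflip_threshold",  # corrected bitflips at which a read is flagged
    "corrected_bits",     # bit errors the ECC has corrected
    "ecc_failures",       # reads the ECC could not correct
    "bad_blocks",         # blocks marked bad in this partition
    "bbt_blocks",         # blocks reserved for the bad block table
)


def read_mtd_counters(mtd_dir):
    """Return {attribute: int} for the attributes present under mtd_dir."""
    stats = {}
    for name in COUNTERS:
        try:
            stats[name] = int((Path(mtd_dir) / name).read_text().strip())
        except (OSError, ValueError):
            continue  # attribute absent or not a plain integer
    return stats


def summarize(stats):
    """Turn the counters into rough, human-readable notes (hypothetical wording)."""
    notes = []
    if stats.get("corrected_bits", 0) > 0:
        notes.append("bit errors were corrected by ECC (expected on NAND)")
    if stats.get("ecc_failures", 0) > 0:
        notes.append("uncorrectable ECC failures occurred (data may be corrupt)")
    if stats.get("bad_blocks", 0) > 0:
        notes.append("blocks are marked bad (a few factory-marked ones are normal)")
    return notes or ["no ECC corrections, ECC failures, or bad blocks reported"]


if __name__ == "__main__":
    for mtd_dir in sorted(Path("/sys/class/mtd").glob("mtd*")):
        if not mtd_dir.name[3:].isdigit():
            continue  # skip the read-only aliases such as mtd0ro
        stats = read_mtd_counters(mtd_dir)
        print(mtd_dir.name, stats)
        for note in summarize(stats):
            print("  -", note)
```

A non-zero corrected_bits with zero ecc_failures would point at
ordinary, managed bit errors rather than bad blocks. Treat the output
as a starting point alongside the kernel log, not a replacement for it.
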
Even if you do have flash errors, it's more likely that you have
developed bit errors and your ECC is set at too low a threshold for
your flash. Or you're not scrubbing properly. Or you didn't write the
flash with proper ECC data. There are a large number of possible
problems. Normal bit-rot is transient and manageable. Developing bad
blocks is somewhat rarer.

If you want a list of possible things to verify with regard to the
flash, I suggest you read this post:

http://lists.infradead.org/pipermail/linux-mtd/2018-December/086331.html

I also suggest you give us the actual kernel log error messages so we
can advise better.

- Steve