Hi Zhou,

On Thu, Jan 22, 2015 at 11:27:01AM +0800, Zhou Wang wrote:
> Very sorry for late, I made tests again and also had a talk with the
> NAND controller hardware colleague. Please find my reply below.

No problem. Glad to hear you followed through on this one, as the
results were curious.

> On 2015/1/13 12:17, Brian Norris wrote:
> > On Wed, Dec 17, 2014 at 07:05:47PM +0800, Zhou Wang wrote:
> >> On 2014-12-17 14:23, Brian Norris wrote:
> > [...]
> >>>> [ 104.648056] mtd_nandbiterrs: ECC failure, read data is incorrect
> >>>> despite read success
> >>>> insmod: can't insert 'mtd_nandbiterrs.ko': Input/output error

[...]

> I made tests again in 1bit/ECC and 16bit/ECC modes using 2K(page)+64B(oob)
> NAND flash. Here are the logs, I also printed ECC code in OOB area.
>
> Results are:
> 1. in 16bit/ECC, it will return -EBADMSG as the ECC codes have been broken.
> 2. in 1bit/ECC, it will not return -EBADMSG because of a hardware design problem.
>    I will explain the detail below.
>
> Test logs:
> 1. in 16bit/ECC(print ECC codes):
>
> /home # insmod mtd_nandbiterrs.ko dev=2 page_offset=1 seed=110 mode=0
...
> mtd_nandbiterrs: error: read failed at 0x800
> mtd_nandbiterrs: After 1 biterrors per subpage, read reported error -74
                                                                      ^^^
Ah, that's what I would expect from a driver that doesn't implement the
raw() functions.

> mtd_nandbiterrs: finished successfully.
> ==================================================
> insmod: can't insert 'mtd_nandbiterrs.ko': Input/output error
>
> 2. in 1bit/ECC(print ECC codes):
> /home # insmod mtd_nandbiterrs.ko dev=2 page_offset=1 seed=110 mode=0
...
> mtd_nandbiterrs: ECC failure, read data is incorrect despite read success
> insmod: can't insert 'mtd_nandbiterrs.ko': Input/output error
>
> Reason about above 1bit/ECC test result:
...
> It can not correct this kind of 2bit errors in 1bit/ECC mode in this NAND
> controller, however, it will trigger a correctable interrupt.
> As a result, software can not find this 1bit error in page data.

IOW, uncorrectable errors are getting reported as corrected bitflips?
That does sound bad.

> This is a hardware problem of this NAND controller.
> I plan to remove the 1bit/ECC mode support in patch of next version.

OK, sounds good. 1-bit HW ECC is not really very useful these days
anyway, if your higher-bit ECC can serve to replace it. Can the ECC
bytes still fit in the same spare area, though?

> > Are you saying you cannot implement the raw() hooks for this IP? Or just
> > that you haven't yet? The latter is probably OK for now (I'd recommend
> > doing this, or at least mark a TODO in the code), but the former is a
> > little disturbing.
>
> The function of raw() hooks is just writing the page data to flash, is this right?

Right, just data (and OOB, if calling the _oob_ functions) without any
ECC parity bytes.

> In non-ECC mode, it can write page data alone to flash. But in ECC mode, NAND
> controller will produce related ECC code automatically, write page data and ECC code
> to flash. In ECC mode, it can not write page data alone to flash for this NAND controller.

Perhaps you can switch between ECC mode and non-ECC mode? At any rate,
this isn't absolutely required.

> As a result, the nandbiterrs test can not pass.
>
> I don't know if I have explained these two problems clearly. If still have something
> confused, please let me know.

Brian
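
Illustrative note on the 1bit/ECC result above: a plain single-error-correcting
code has no way to tell a 2-bit error from a 1-bit error, which is one way a
read can report "corrected" while returning wrong data. The sketch below uses a
textbook Hamming(7,4) code to show that effect. It is only an assumption-laden
illustration: the thread does not describe the ECC algorithm this controller
actually uses, and the names encode() and decode() are hypothetical helpers,
not kernel or driver functions.

```python
# Textbook Hamming(7,4): positions 1..7, parity at 1, 2, 4, data at 3, 5, 6, 7.
# NOT the controller's real algorithm; a generic single-error-correcting code.

DATA_POS = (3, 5, 6, 7)
PARITY_POS = (1, 2, 4)


def syndrome_of(code):
    """XOR of the 1-based positions of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 8):
        if code[pos - 1]:
            s ^= pos
    return s


def encode(data_bits):
    """Map 4 data bits to a 7-bit codeword whose syndrome is 0."""
    code = [0] * 7
    for pos, bit in zip(DATA_POS, data_bits):
        code[pos - 1] = bit
    s = syndrome_of(code)
    for p in PARITY_POS:
        if s & p:
            code[p - 1] = 1
    return code


def decode(code):
    """Return (data_bits, corrected_count), trusting the syndrome blindly."""
    code = list(code)
    s = syndrome_of(code)
    corrected = 0
    if s:
        code[s - 1] ^= 1  # "fix" the bit the syndrome points at
        corrected = 1
    return [code[p - 1] for p in DATA_POS], corrected


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    word = encode(data)

    one_flip = list(word)
    one_flip[2] ^= 1  # flip position 3
    print(decode(one_flip))  # original data, 1 bit corrected

    two_flips = list(word)
    two_flips[2] ^= 1  # flip position 3
    two_flips[4] ^= 1  # flip position 5
    print(decode(two_flips))  # wrong data, still reported as 1 bit corrected
```

With two flipped bits the syndrome is non-zero and points at a third, innocent
bit, so the decoder reports one corrected bitflip and hands back wrong data.
That matches the symptom mtd_nandbiterrs prints ("read data is incorrect
despite read success"), though whether this is the exact mechanism in this
hardware is not something the thread establishes.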