On 12.06.2018 10:13, Boris Brezillon wrote:
> On Tue, 12 Jun 2018 10:02:12 +0200
> Stefan Agner <stefan@xxxxxxxx> wrote:
>
>> >> +static int tegra_nand_read_page_hwecc(struct mtd_info *mtd,
>> >> +				      struct nand_chip *chip,
>> >> +				      uint8_t *buf, int oob_required, int page)
>> >> +{
>> >> +	struct tegra_nand_controller *ctrl = to_tegra_ctrl(chip->controller);
>> >> +	struct tegra_nand_chip *nand = to_tegra_chip(chip);
>> >> +	void *oob_buf = oob_required ? chip->oob_poi : 0;
>> >> +	u32 dec_stat, max_corr_cnt;
>> >> +	unsigned long fail_sec_flag;
>> >> +	int ret;
>> >> +
>> >> +	tegra_nand_hw_ecc(ctrl, chip, true);
>> >> +	ret = tegra_nand_page_xfer(mtd, chip, buf, oob_buf, nand->tag.length,
>> >> +				   page, true);
>> >> +	tegra_nand_hw_ecc(ctrl, chip, false);
>> >> +	if (ret)
>> >> +		return ret;
>> >> +
>> >> +	/* No correctable or un-correctable errors, page must have 0 bitflips */
>> >> +	if (!ctrl->last_read_error)
>> >> +		return 0;
>> >> +
>> >> +	/*
>> >> +	 * Correctable or un-correctable errors occurred. Use DEC_STAT_BUF
>> >> +	 * which contains information for all ECC selections.
>> >> +	 *
>> >> +	 * Note that since we do not use Command Queues DEC_RESULT does not
>> >> +	 * state the number of pages we can read from the DEC_STAT_BUF. But
>> >> +	 * since CORRFAIL_ERR did occur during page read we do have a valid
>> >> +	 * result in DEC_STAT_BUF.
>> >> +	 */
>> >> +	ctrl->last_read_error = false;
>> >> +	dec_stat = readl_relaxed(ctrl->regs + DEC_STAT_BUF);
>> >> +
>> >> +	fail_sec_flag = (dec_stat & DEC_STAT_BUF_FAIL_SEC_FLAG_MASK) >>
>> >> +			DEC_STAT_BUF_FAIL_SEC_FLAG_SHIFT;
>> >> +
>> >> +	max_corr_cnt = (dec_stat & DEC_STAT_BUF_MAX_CORR_CNT_MASK) >>
>> >> +		       DEC_STAT_BUF_MAX_CORR_CNT_SHIFT;
>> >> +
>> >> +	if (fail_sec_flag) {
>> >> +		int bit, max_bitflips = 0;
>> >> +
>> >> +		/*
>> >> +		 * Check if all sectors in a page failed. If only some failed
>> >> +		 * its definitly not an erased page and we can return error
>> >> +		 * stats right away.
>> >> +		 *
>> >> +		 * E.g. controller might return fail_sec_flag with 0x4, which
>> >> +		 * would mean only the third sector failed to correct.
>
> That works because you have NAND_NO_SUBPAGE_WRITE set (i.e. no partial
> page programming), probably something you should state here.
>

Ok, will add a note.

>> >> +		 */
>> >> +		if (fail_sec_flag ^ GENMASK(chip->ecc.steps - 1, 0)) {
>> >> +			mtd->ecc_stats.failed += hweight8(fail_sec_flag);
>> >> +			return max_corr_cnt;
>> >> +		}
>> >> +
>> >> +		/*
>> >> +		 * All sectors failed to correct, but the ECC isn't smart
>> >> +		 * enough to figure out if a page is really completely erased.
>> >> +		 * We check the read data here to figure out if it's a
>> >> +		 * legitimate ECC error or only an erased page.
>> >> +		 */
>> >> +		for_each_set_bit(bit, &fail_sec_flag, chip->ecc.steps) {
>> >> +			u8 *data = buf + (chip->ecc.size * bit);
>> >> +
>> >> +			ret = nand_check_erased_ecc_chunk(data, chip->ecc.size,
>> >> +							  NULL, 0,
>
> You should also check that the ECC bytes are 0xff here, otherwise you
> won't detect corruption of pages almost filled 0xff but with a few bits
> set to 0.
>
> When you use nand_check_erased_ecc_chunk(), it's important to always
> pass the data along with its associated ECC bytes.
>

Hm, I see this is important in case bitflips accumulate in OOB area
only.

>> >> +							  NULL, 0,
>
> If you support writing extra OOB bytes, you should also pass them here.
>

I see. OOB bytes handled together with the last subpage.

>> >> +							  chip->ecc.strength);
>> >> +			if (ret < 0)
>> >> +				mtd->ecc_stats.failed++;
>> >> +			else
>> >> +				max_bitflips = max(ret, max_bitflips);

I guess I should also increment ecc_stats.corrected here. Is it correct
that I increment for every step? So if I have an ECC strength of 16, an
empty page could have 8 bitflips in the first step, and 12 in the
second, I would increment mtd->ecc_stats.corrected by 20 but return 12
(maximum number of bitflips per step)?

--
Stefan

>> >> +		}
>> >> +
>> >> +		return max_t(unsigned int, max_corr_cnt, max_bitflips);
>> >> +	} else {
>> >> +		int corr_sec_flag;
>> >> +
>> >> +		corr_sec_flag = (dec_stat & DEC_STAT_BUF_CORR_SEC_FLAG_MASK) >>
>> >> +				DEC_STAT_BUF_CORR_SEC_FLAG_SHIFT;
>> >> +
>> >> +		/*
>> >> +		 * The value returned in the register is the maximum of
>> >> +		 * bitflips encountered in any of the ECC regions. As there is
>> >> +		 * no way to get the number of bitflips in a specific regions
>> >> +		 * we are not able to deliver correct stats but instead
>> >> +		 * overestimate the number of corrected bitflips by assuming
>> >> +		 * that all regions where errors have been corrected
>> >> +		 * encountered the maximum number of bitflips.
>> >> +		 */
>> >> +		mtd->ecc_stats.corrected += max_corr_cnt * hweight8(corr_sec_flag);
>> >> +
>> >> +		return max_corr_cnt;
>> >> +	}
>> >> +
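
To make the accounting asked about above concrete (ECC strength 16, 8
bitflips in the first step and 12 in the second), here is a minimal
userspace sketch. It is not driver code: struct ecc_stats and
account_steps() are hypothetical names made up for illustration, and the
per-step results are assumed to follow the nand_check_erased_ecc_chunk()
convention (negative for an uncorrectable step, otherwise the number of
bitflips found). As far as I can tell this matches the pattern of the
generic read-page helpers in the NAND core: ecc_stats.corrected sums the
bitflips of every step, while the return value is the per-step maximum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the counters kept in mtd->ecc_stats. */
struct ecc_stats {
	unsigned int corrected;
	unsigned int failed;
};

/*
 * Hypothetical helper, for illustration only.
 *
 * per_step[i] is the result for ECC step i: a negative value means the
 * step is uncorrectable, any other value is the number of bitflips
 * found in that step.
 *
 * Every correctable step adds its bitflip count to stats->corrected;
 * the return value is the largest bitflip count seen in a single step.
 */
static unsigned int account_steps(const int *per_step, int steps,
				  struct ecc_stats *stats)
{
	unsigned int max_bitflips = 0;
	int i;

	assert(per_step != NULL && stats != NULL);

	for (i = 0; i < steps; i++) {
		if (per_step[i] < 0) {
			/* Uncorrectable step: count a failure, no bitflips. */
			stats->failed++;
			continue;
		}

		/* Statistics accumulate over all steps ... */
		stats->corrected += (unsigned int)per_step[i];

		/* ... but the caller only sees the worst single step. */
		if ((unsigned int)per_step[i] > max_bitflips)
			max_bitflips = (unsigned int)per_step[i];
	}

	return max_bitflips;
}
```

With per-step results { 8, 12 } the helper adds 20 to corrected and
returns 12, i.e. the numbers from the question above.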