On 01/21/2016 02:21 PM, Boris Brezillon wrote:
Hi Archit,
On Thu, 21 Jan 2016 12:43:18 +0530
Archit Taneja <architt@xxxxxxxxxxxxxx> wrote:
The Qualcomm NAND controller is found in SoCs like IPQ806x, MSM7xx,
MDM9x15 series.
It exists as a sub block inside the IPs EBI2 (External Bus Interface 2)
and QPIC (Qualcomm Parallel Interface Controller). These IPs provide a
broader interface for external slow peripheral devices such as LCD and
NAND/NOR flash memory or SRAM like interfaces.
We add support for the NAND controller found within EBI2. For the SoCs
of our interest, we only use the NAND controller within EBI2. Therefore,
it's safe for us to assume that the NAND controller is a standalone block
within the SoC.
The controller supports 512B, 2kB, 4kB and 8kB page 8-bit and 16-bit NAND
flash devices. It contains a HW ECC block that supports BCH ECC (4, 8 and
16 bit correction/step) and RS ECC(4 bit correction/step) that covers main
and spare data. The controller contains an internal 512 byte page buffer
to which we read/write via DMA. The EBI2 type NAND controller uses ADM DMA
for register read/write and data transfers. The controller performs page
reads and writes at a codeword/step level of 512 bytes. It can support up
to 2 external chips of different configurations.
The driver prepares register read and write configuration descriptors for
each codeword, followed by data descriptors to read or write data from the
controller's internal buffer. It uses a single ADM DMA channel that we get
via dmaengine API. The controller requires 2 ADM CRCIs for command and
data flow control. These are passed via DT.
The ecc layout used by the controller is syndrome like, but we can't use
the standard syndrome ecc ops because of several reasons. First, the amount
of data bytes covered by ecc isn't same in each step. Second, writing to
free oob space requires us writing to the entire step in which the oob
lies. This forces us to create our own ecc ops.
One more difference is how the controller accesses the bad block marker.
The controller ignores reading the marker when ECC is enabled. ECC needs
to be explicity disabled to read or write to the bad block marker. The
nand_bbt helpers library hence can't access BBMs for the controller.
For now, we skip the creation of BBT and populate chip->block_bad and
chip->block_markbad helpers instead.
Reviewed-by: Andy Gross <agross@xxxxxxxxxxxxxx>
Signed-off-by: Stephen Boyd <sboyd@xxxxxxxxxxxxxx>
Signed-off-by: Archit Taneja <architt@xxxxxxxxxxxxxx>
Sorry, I noticed one more thing in your "bitflips in erased pages"
handling. Once this is addressed (or explained) you can add my
Reviewed-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxxxxxxx>
Thanks! I've given an explanation below.
---
v7:
- Incorporated missing/new comments by Boris
- Cleaned up some strict checkpatch warnings
v6:
- Fix up erased page parsing. Use nand_check_erased_ecc_chunk to
return corrected bitflips in an erased page.
- Fix whitespace issues
- Update compatible tring to something more specific
v5:
- split chip/controller structs
- simplify layout by considering reserved bytes as part of ECC
- create ecc layouts automatically
- implement block_bad and block_markbad chip ops instead of
- read_oob_raw/write_oob_raw ecc ops to access BBMs.
- Add NAND_SKIP_BBTSCAN flag until we get badblockbits support.
- misc clean ups
v4:
- Shrink submit_descs
- add desc list node at the end of dma_prep_desc
- Endianness and warning fixes
- Add Stephen's Signed-off since he provided a patch to fix
endianness problems
v3:
- Refactor dma functions for maximum reuse
- Use dma_slave_confing on stack
- optimize and clean upempty_page_fixup using memchr_inv
- ensure portability with dma register reads using le32_* funcs
- use NAND_USE_BOUNCE_BUFFER instead of doing it ourselves
- fix handling of return values of dmaengine funcs
- constify wherever possible
- Remove dependency on ADM DMA in Kconfig
- Misc fixes and clean ups
v2:
- Use new BBT flag that allows us to read BBM in raw mode
- reduce memcpy-s in the driver
- some refactor and clean ups because of above changes
drivers/mtd/nand/Kconfig | 7 +
drivers/mtd/nand/Makefile | 1 +
drivers/mtd/nand/qcom_nandc.c | 2024 +++++++++++++++++++++++++++++++++++++++++
3 files changed, 2032 insertions(+)
create mode 100644 drivers/mtd/nand/qcom_nandc.c
[...]
diff --git a/drivers/mtd/nand/qcom_nandc.c b/drivers/mtd/nand/qcom_nandc.c
new file mode 100644
index 0000000..269d388
--- /dev/null
+++ b/drivers/mtd/nand/qcom_nandc.c
[...]
+/*
+ * when using BCH ECC, the HW flags an error in NAND_FLASH_STATUS if it read
+ * an erased CW, and reports an erased CW in NAND_ERASED_CW_DETECT_STATUS.
+ *
+ * when using RS ECC, the HW reports the same erros when reading an erased CW,
+ * but it notifies that it is an erased CW by placing special characters at
+ * certain offsets in the buffer.
+ *
+ * verify if the page is erased or not, and fix up the page for RS ECC by
+ * replacing the special characters with 0xff
+ */
+static bool empty_page_fixup(struct qcom_nand_host *host, u8 *data_buf)
+{
+ struct nand_chip *chip = &host->chip;
+ struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
+ struct nand_ecc_ctrl *ecc = &chip->ecc;
+ struct read_stats *buf;
+ int i;
+
+ buf = (struct read_stats *)nandc->reg_read_buf;
+
+ for (i = 0; i < ecc->steps; i++, buf++) {
+ u32 flash, erased_cw;
+ u8 empty1, empty2;
+
+ flash = le32_to_cpu(buf->flash);
+ erased_cw = le32_to_cpu(buf->erased_cw);
+
+ /*
+ * an erased page flags an error in NAND_FLASH_STATUS, if there
+ * isn't any error, bail out early and report a non-erased
+ * page
+ */
+ if (!(flash & FS_OP_ERR))
+ break;
+
+ /*
+ * if BCH is enabled, HW will take care of detecting erased
+ * pages
+ */
+ if (host->bch_enabled) {
+ /* bail out if we didn't detect an erased CW */
+ if ((erased_cw & ERASED_CW) != ERASED_CW)
+ break;
+ } else {
+ /*
+ * if RS ECC is enabled, check if the CW is erased by
+ * looking for 0x54s at offsets 3 and 175
+ */
+ empty1 = data_buf[3 + i * host->cw_data];
+ empty2 = data_buf[175 + i * host->cw_data];
+
+ /* bail out if the CW isn't erased */
+ if (!(empty1 == 0x54 && empty2 == 0xff) &&
+ !(empty1 == 0xff && empty2 == 0x54))
+ break;
+ }
+ }
+
+ if (i < ecc->steps)
+ return false;
+
+ if (!host->bch_enabled) {
+ /*
+ * fix up the buffer by replacing the magic offsets with
+ * 0xff
+ */
+ for (i = 0; i < ecc->steps; i++) {
+ data_buf[3 + i * host->cw_data] = 0xff;
+ data_buf[175 + i * host->cw_data] = 0xff;
+ }
+ }
+
+ return true;
+}
+
+/*
+ * reads back status registers set by the controller to notify page read
+ * errors. this is equivalent to what 'ecc->correct()' would do.
+ */
+static int parse_read_errors(struct qcom_nand_host *host, u8 *data_buf,
+ u8 *oob_buf, bool erased_page)
+{
+ struct nand_chip *chip = &host->chip;
+ struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
+ struct mtd_info *mtd = nand_to_mtd(chip);
+ struct nand_ecc_ctrl *ecc = &chip->ecc;
+ unsigned int max_bitflips = 0;
+ struct read_stats *buf;
+ int i;
+
+ buf = (struct read_stats *)nandc->reg_read_buf;
+
+ for (i = 0; i < ecc->steps; i++, buf++) {
+ u32 flash, buffer;
+ int data_len, oob_len;
+
+ if (i == (ecc->steps - 1)) {
+ data_len = ecc->size - ((ecc->steps - 1) << 2);
+ oob_len = ecc->steps << 2;
+ } else {
+ data_len = host->cw_data;
+ oob_len = 0;
+ }
+
+ flash = le32_to_cpu(buf->flash);
+ buffer = le32_to_cpu(buf->buffer);
+
+ if (flash & (FS_OP_ERR | FS_MPU_ERR)) {
+ if (erased_page) {
+ int ret, ecclen, extraooblen;
+ void *eccbuf;
+
+ eccbuf = oob_buf ? oob_buf + oob_len : NULL;
+ ecclen = oob_buf ? host->ecc_bytes_hw : 0;
+ extraooblen = oob_buf ? oob_len : 0;
+
+ ret = nand_check_erased_ecc_chunk(data_buf,
+ data_len, eccbuf, ecclen, oob_buf,
+ extraooblen, ecc->strength);
IIUC, the erased_page info is returned by empty_page_fixup() and is
only set if the page is detected as empty (filled with ff).
If that's the case, then you don't have to use
nand_check_erased_ecc_chunk() to check it again...
empty_page_fixup now doesn't parse the entire page for 0xffs, it just
checks if the correct flags have been raised by the controller hardware,
and replaces the 'special offsets' with 0xffs instead of 0x54s.
From what I understood, we still need to parse the chunks to try to
fix 'ecc->strength' number of bitflips.
+ if (ret < 0) {
+ mtd->ecc_stats.failed++;
+ } else {
+ mtd->ecc_stats.corrected += ret;
+ max_bitflips =
+ max_t(unsigned int, max_bitflips, ret);
+ }
+ } else {
+ if (buffer & BS_UNCORRECTABLE_BIT) {
... here is where you should check if what was detected as
uncorrectable errors is not in fact some bitflips in an erased page.
This path will never hit for a page reported as erased by the HW. The
'else' branch happens only for pages that were reported as 'not erased'
by empty page fixup.
In other words, the BS_UNCORRECTABLE_BIT register bitfield is never
checked for an erased page. In some experiments I performed, I noticed
that this bitfield was almost always set for an erased page. There is
not point in even checking this field for an erased page.
Thanks,
Archit
+ mtd->ecc_stats.failed++;
+ } else {
+ unsigned int stat;
+
+ stat = buffer & BS_CORRECTABLE_ERR_MSK;
+ mtd->ecc_stats.corrected += stat;
+ max_bitflips = max(max_bitflips, stat);
+ }
+ }
+ }
+
+ data_buf += data_len;
+ if (oob_buf)
+ oob_buf += oob_len + ecc->bytes;
+ }
+
+ return max_bitflips;
+}
Best Regards,
Boris
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html