Wrong address for Boris again, sorry for the noise.

> Hi Lucas, Marco,
>
> Lucas Stach <l.stach@xxxxxxxxxxxxxx> wrote on Fri, 26 Jul 2019 10:54:11
> +0200:
>
> > Hi Miguel,
> >
> > On Friday, 26.07.2019 at 10:28 +0200, Miquel Raynal wrote:
> > > Hi Marco,
> > >
> > > + Richard
> > > + Working e-mail address for Boris
> > >
> > > Marco Felsch <m.felsch@xxxxxxxxxxxxxx> wrote on Fri, 26 Jul 2019
> > > 09:44:34 +0200:
> > >
> > > > Some devices don't support ecc "official". By "official" I mean that the
>                                  ^ uppercase ECC
>
> > > > feature can be set trough the "SET FEATURE (EFh)" command but isn't
> > > > reported to the "READ ID Parameter Tables". Because the "ECC Field"
> > > > still says that it is disabled. This is applicable at least
> > > > for the MT29F2G08ABAGA and MT29F2G08ABBGA devices. Even worse the
> > > > datasheet describes the ECC feature in chapter "ECC Protection".
>
> What about:
>
> "Some devices are not supposed to support on-die ECC but
> experience shows that the internal ECC machinery can actually be enabled
> through the "SET FEATURE (EFh)" command, even if a read of the "READ ID
> Parameter Tables" reports that it is not."
>
> > > >
> > > > Currently the driver checks the "READ ID Parameter" field directly after
> > > > we enabled the feature. If the check fails we return immediately but
> > > > leave the ECC on. Now all future read/program cycles goes trough the ecc
> > > > and the host nfc gets confused and reports ECC errors.
>
> And here:
>
> "Currently, the driver checks the "READ ID Parameter" field
> directly after having enabled the feature. If the check fails it returns
> immediately but leaves the ECC on. When using buggy chips like
> MT29F2G08ABAGA and MT29F2G08ABBGA, all future read/program cycles will
> go through the on-die ECC, confusing the host controller which is
> supposed to be the one handling correction."
>
> > > > To address this in a common way we need to turn off the ECC directly
> > > > after reading the "READ ID Parameter" and before checking the
> > > > "ECC status".
> > > >
> > > > Signed-off-by: Marco Felsch <m.felsch@xxxxxxxxxxxxxx>
> > >
> > > Good catch! However you report that on-die ECC correction is working
> > > but you still disable it; any reason to do so? Would it be better to
> > > actually enable on-die ECC and explicitly mark these two chips as
> > > buggy (see [1] for checking the chip IDs)?
> >
> > It's the other way around. The chip is not supposed to have on-die ECC
> > according to the datasheet and correctly reflects this fact in the
> > READ_ID, so Linux should not try to use the on-die ECC.
>
> Ok, I understood the opposite because of the "Even worse the datasheet
> describes the ECC feature [...]" which implied to me that the on-die ECC
> feature was actually expected despite the status bit not being set.
>
> Marco, can you rephrase the commit log a bit? I proposed something,
> feel free to adapt.
>
> > The bug is that the NAND is not supposed to have on-die ECC and reports
> > this correctly, but then actually enables an on-die ECC unit when asked
> > to, probably due to the same die being used for on-die ECC and ECC off
> > devices. The consequence is that Linux (correctly) assumes that the
> > full OOB size is available to the controller, but the on-die ECC unit
> > scribbles over some of the OOB data.
> >
> > I think this fix is the most robust solution, as it makes sure to disable
> > the on-die ECC unit to avoid the issue, which might also be present on
> > other NAND chips we don't know about yet.
> >
> > Regards,
> > Lucas
> >
> > > [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_macronix.c#L83
> > >
> > > > ---
> > > >  drivers/mtd/nand/raw/nand_micron.c | 14 +++++++++++---
> > > >  1 file changed, 11 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
> > > > index 1622d3145587..fb199ad2f1a6 100644
> > > > --- a/drivers/mtd/nand/raw/nand_micron.c
> > > > +++ b/drivers/mtd/nand/raw/nand_micron.c
> > > > @@ -390,6 +390,14 @@ static int micron_supports_on_die_ecc(struct nand_chip *chip)
> > > >  	    (chip->id.data[4] & MICRON_ID_INTERNAL_ECC_MASK) != 0x2)
> > > >  		return MICRON_ON_DIE_UNSUPPORTED;
> > > >  
> > > > +	/*
> > > > +	 * It seems that there are devices which do not support ECC official.
> > > > +	 * At least the MT29F2G08ABAGA / MT29F2G08ABBGA devices supports
> > > > +	 * enabling the ECC feature but don't reflect that to the READ_ID table.
> > > > +	 * So we have to guarantee that we disable the ECC feature directly
> > > > +	 * after we did the READ_ID table command. Later we can evaluate the
> > > > +	 * ECC_ENABLE support.
> > > > +	 */
> > > >  	ret = micron_nand_on_die_ecc_setup(chip, true);
> > > >  	if (ret)
> > > >  		return MICRON_ON_DIE_UNSUPPORTED;
> > > > @@ -398,13 +406,13 @@ static int micron_supports_on_die_ecc(struct nand_chip *chip)
> > > >  	if (ret)
> > > >  		return MICRON_ON_DIE_UNSUPPORTED;
> > > >  
> > > > -	if (!(id[4] & MICRON_ID_ECC_ENABLED))
> > > > -		return MICRON_ON_DIE_UNSUPPORTED;
> > > > -
> > > >  	ret = micron_nand_on_die_ecc_setup(chip, false);
> > > >  	if (ret)
> > > >  		return MICRON_ON_DIE_UNSUPPORTED;
> > > >  
> > > > +	if (!(id[4] & MICRON_ID_ECC_ENABLED))
> > > > +		return MICRON_ON_DIE_UNSUPPORTED;
> > > > +
> > > >  	ret = nand_readid_op(chip, 0, id, sizeof(id));
> > > >  	if (ret)
> > > >  		return MICRON_ON_DIE_UNSUPPORTED;
> > >
> > > Thanks,
> > > Miquèl
>
> Thanks,
> Miquèl

Thanks,
Miquèl
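
For readers following the thread, the ordering problem can be reproduced with a small user-space model. This is only a sketch under stated assumptions: struct fake_chip, set_ecc(), read_id4(), probe_old() and probe_new() are hypothetical stand-ins for the chip and for the driver helpers in the patch, and the ID_ECC_ENABLED value is assumed for illustration. None of this is kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position for the "ECC enabled" flag in ID byte 4 (illustrative). */
#define ID_ECC_ENABLED 0x80

/* Hypothetical model of a NAND chip, not the kernel's struct nand_chip. */
struct fake_chip {
	bool ecc_on;      /* current state of the on-die ECC engine */
	bool reports_ecc; /* false models a chip whose READ ID never shows the flag */
};

/* Stand-in for the SET FEATURE call that switches on-die ECC on or off. */
static void set_ecc(struct fake_chip *c, bool on)
{
	c->ecc_on = on;
}

/* Stand-in for READ ID: returns byte 4 as the chip chooses to report it. */
static uint8_t read_id4(const struct fake_chip *c)
{
	return (c->ecc_on && c->reports_ecc) ? ID_ECC_ENABLED : 0;
}

/* Old order: check the flag first, so an early return leaves ECC enabled. */
bool probe_old(struct fake_chip *c)
{
	uint8_t id4;

	set_ecc(c, true);
	id4 = read_id4(c);
	if (!(id4 & ID_ECC_ENABLED))
		return false; /* on-die ECC is still on here */
	set_ecc(c, false);
	return true;
}

/* New order: always disable ECC right after the read, then evaluate the flag. */
bool probe_new(struct fake_chip *c)
{
	uint8_t id4;

	set_ecc(c, true);
	id4 = read_id4(c);
	set_ecc(c, false);
	return (id4 & ID_ECC_ENABLED) != 0;
}
```

With reports_ecc set to false, both probes report "unsupported", but only probe_new() leaves ecc_on cleared afterwards, which is the behaviour the patch is after.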