Re: [PATCH] mtd: rawnand: micron: handle "ecc off" devices correctly

Hi Miquel,

On Friday, 26.07.2019 at 10:28 +0200, Miquel Raynal wrote:
> Hi Marco,
> 
> + Richard
> + Working e-mail address for Boris
> 
> Marco Felsch <m.felsch@xxxxxxxxxxxxxx> wrote on Fri, 26 Jul 2019 09:44:34 +0200:
> 
> > Some devices don't support ECC "officially". By "officially" I mean that
> > the feature can be set through the "SET FEATURE (EFh)" command but isn't
> > reported in the "READ ID Parameter Tables", because the "ECC Field"
> > still says that it is disabled. This is applicable at least
> > for the MT29F2G08ABAGA and MT29F2G08ABBGA devices. Even worse, the
> > datasheet describes the ECC feature in chapter "ECC Protection".
> > 
> > Currently the driver checks the "READ ID Parameter" field directly after
> > we enabled the feature. If the check fails we return immediately but
> > leave the ECC on. Now all future read/program cycles go through the ECC
> > and the host NFC gets confused and reports ECC errors.
> > 
> > To address this in a common way we need to turn off the ECC directly
> > after reading the "READ ID Parameter" and before checking the
> > "ECC status".
> > 
> > Signed-off-by: Marco Felsch <m.felsch@xxxxxxxxxxxxxx>
> 
> Good catch! However you report that on-die ECC correction is working
> but you still disable it; any reason to do so ? Would it be better to
> actually enable on-die ECC and explicitly mark these two chips as
> buggy (see [1] for checking the chip IDs)?

It's the other way around. The chip is not supposed to have on-die ECC
according to the datasheet and correctly reflects this fact in the
READ_ID, so Linux should not try to use the on-die ECC.

The bug is that the NAND is not supposed to have on-die ECC and reports
this correctly, but then actually enables an on-die ECC unit when asked
to, probably due to the same die being used for on-die ECC and ECC off
devices. The consequence is that Linux (correctly) assumes that the
full OOB size is available to the controller, but the on-die ECC unit
scribbles over some of the OOB data.

I think this fix is the most robust solution, as it makes sure to disable
the on-die ECC unit to avoid the issue, which might also be present on
other NAND chips we don't know about yet.
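
To illustrate the ordering issue, here is a rough userspace sketch with a
mock chip. It is not the kernel code: struct mock_chip, set_on_die_ecc(),
readid_ecc_enabled(), probe_old(), probe_new() and run_demo() are made-up
names, and the second READ ID check done by the real function is left out.
The mock simply assumes a chip whose ECC unit switches on when asked while
its READ ID never reflects that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

enum { ON_DIE_UNSUPPORTED, ON_DIE_SUPPORTED };

/* Hypothetical stand-in for a NAND chip; not the kernel's struct nand_chip. */
struct mock_chip {
	bool ecc_on;           /* current state of the on-die ECC unit */
	bool id_reflects_ecc;  /* does READ ID report the ECC state? */
};

/* Stand-in for the SET FEATURE call that toggles on-die ECC. */
static void set_on_die_ecc(struct mock_chip *c, bool on)
{
	c->ecc_on = on;
}

/* "ECC enabled" bit as this mock chip reports it through READ ID. */
static bool readid_ecc_enabled(const struct mock_chip *c)
{
	return c->id_reflects_ecc && c->ecc_on;
}

/* Ordering before the patch: check first, disable afterwards. */
static int probe_old(struct mock_chip *c)
{
	set_on_die_ecc(c, true);
	if (!readid_ecc_enabled(c))
		return ON_DIE_UNSUPPORTED;  /* early return leaves ECC on */
	set_on_die_ecc(c, false);
	return ON_DIE_SUPPORTED;
}

/* Ordering after the patch: sample the bit, disable, then check. */
static int probe_new(struct mock_chip *c)
{
	bool enabled;

	set_on_die_ecc(c, true);
	enabled = readid_ecc_enabled(c);
	set_on_die_ecc(c, false);       /* always runs before any return */
	if (!enabled)
		return ON_DIE_UNSUPPORTED;
	return ON_DIE_SUPPORTED;
}

/* Run both orderings against an "ECC off" chip that still honours the request. */
void run_demo(void)
{
	struct mock_chip a = { .ecc_on = false, .id_reflects_ecc = false };
	struct mock_chip b = a;
	int old_ret = probe_old(&a);
	int new_ret = probe_new(&b);

	assert(old_ret == ON_DIE_UNSUPPORTED);
	assert(new_ret == ON_DIE_UNSUPPORTED);
	printf("old ordering leaves ECC %s\n", a.ecc_on ? "on" : "off");
	printf("new ordering leaves ECC %s\n", b.ecc_on ? "on" : "off");
}
```

Both orderings report "unsupported" for such a chip, but only the new one
leaves the ECC unit switched off afterwards, which is the point of the
patch.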

Regards,
Lucas 

> [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_macronix.c#L83
> 
> > ---
> >  drivers/mtd/nand/raw/nand_micron.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
> > index 1622d3145587..fb199ad2f1a6 100644
> > --- a/drivers/mtd/nand/raw/nand_micron.c
> > +++ b/drivers/mtd/nand/raw/nand_micron.c
> > @@ -390,6 +390,14 @@ static int micron_supports_on_die_ecc(struct nand_chip *chip)
> >  	    (chip->id.data[4] & MICRON_ID_INTERNAL_ECC_MASK) != 0x2)
> >  		return MICRON_ON_DIE_UNSUPPORTED;
> >  
> > +	/*
> > +	 * It seems that there are devices which do not support ECC official.
> > +	 * At least the MT29F2G08ABAGA / MT29F2G08ABBGA devices supports
> > +	 * enabling the ECC feature but don't reflect that to the READ_ID table.
> > +	 * So we have to guarantee that we disable the ECC feature directly
> > +	 * after we did the READ_ID table command. Later we can evaluate the
> > +	 * ECC_ENABLE support.
> > +	 */
> >  	ret = micron_nand_on_die_ecc_setup(chip, true);
> >  	if (ret)
> >  		return MICRON_ON_DIE_UNSUPPORTED;
> > @@ -398,13 +406,13 @@ static int micron_supports_on_die_ecc(struct nand_chip *chip)
> >  	if (ret)
> >  		return MICRON_ON_DIE_UNSUPPORTED;
> >  
> > -	if (!(id[4] & MICRON_ID_ECC_ENABLED))
> > -		return MICRON_ON_DIE_UNSUPPORTED;
> > -
> >  	ret = micron_nand_on_die_ecc_setup(chip, false);
> >  	if (ret)
> >  		return MICRON_ON_DIE_UNSUPPORTED;
> >  
> > +	if (!(id[4] & MICRON_ID_ECC_ENABLED))
> > +		return MICRON_ON_DIE_UNSUPPORTED;
> > +
> >  	ret = nand_readid_op(chip, 0, id, sizeof(id));
> >  	if (ret)
> >  		return MICRON_ON_DIE_UNSUPPORTED;
> 
> Thanks,
> Miquèl
> 

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/



