Hi liao, liaoweixiong <liaoweixiong@xxxxxxxxxxxxxxxxx> wrote on Thu, 6 Feb 2020 21:10:47 +0800: > hi Miquel Raynal, > > On 2020/1/23 AM 1:41, Miquel Raynal wrote: > > Hello, > > > > > >>>>>> +/* > >>>>>> + * All zones will be read as pstore/blk will read zone one by one when do > >>>>>> + * recover. > >>>>>> + */ > >>>>>> +static ssize_t mtdpstore_read(char *buf, size_t size, loff_t off) > >>>>>> +{ > >>>>>> + struct mtdpstore_context *cxt = &oops_cxt; > >>>>>> + size_t retlen; > >>>>>> + int ret; > >>>>>> + > >>>>>> + if (mtdpstore_block_isbad(cxt, off)) > >>>>>> + return -ENEXT; > >>>>>> + > >>>>>> + pr_debug("try to read off 0x%llx size %zu\n", off, size); > >>>>>> + ret = mtd_read(cxt->mtd, off, size, &retlen, (u_char *)buf); > >>>>>> + if ((ret < 0 && !mtd_is_bitflip(ret)) || size != retlen) { > >>>>> > >>>>> IIRC size != retlen does not mean it failed, but that you should > >>>>> continue reading after retlen bytes, no? > >>>>> >> > >>>> Yes, you are right. I will fix it. Thanks. > >>>> >>>>> Also, mtd_is_bitflip() does not mean that you are reading a false > >>>>> buffer, but that the data has been corrected as it contained bitflips. > >>>>> mtd_is_eccerr() however, would be meaningful. > >>>>> >> > >>>> Sure I know mtd_is_bitflip() does not mean failure, but I do not think > >>>> mtd_is_eccerr() should be here since the codes are ret < 0 and NOT > >>>> mtd_is_bitflip(). > >>> > >>> Yes, just drop this check, only keep ret < 0. > >>> >> > >> If I don't get it wrong, it should not be dropped here. Like your words, > >> "mtd_is_bitflip() does not mean that you are reading a false buffer, > >> but that the data has been corrected as it contained bitflips.", the > >> data I get are valid even if mtd_is_bitflip() return true. It's correct > >> data and it's no need to go to handle error. To me, the codes > >> should be: > >> if (ret < 0 && !mit_is_bitflip()) > >> [error handling] > > > > Please check the implementation of mtd_is_bitflip(). You'll probably > > figure out what I am saying. > > > > https://elixir.bootlin.com/linux/latest/source/include/linux/mtd/mtd.h#L585 > > > > How about the codes as follows: > > for (done = 0, retlen = 0; done < size; done += retlen) { > ret = mtd_read(..., &retlen, ...); > if (!ret) > continue; > /* > * do nothing if bitflip and ecc error occurs because whether > * it's bitflip or ECC error, just a small number of bits flip > * and the impact on log data is so small. The mtdpstore just > * hands over what it gets and user can judge whether the data > * is valid or not. > */ > if (mtd_is_bitflip(ret)) { > dev_warn("bitflip at...."); > continue; I don't understand why do you check for bitflips. Bitflips have been corrected at this stage, you just get the information that there has been bitflips, but the data integrity is fine. I am not against ignoring ECC errors in this case though. I would propose: for (...) { if (ret < 0) { complain; return; } if (mtd_is_eccerr()) complain; } > } else if (mtd_is_eccerr(ret)) { > dev_warn("eccerr at...."); > retlen = retlen == 0 ? size : retlen; > continue; > } else { > dev_err("read failure at..."); > /* this zone is broken, try next one */ > return -ENEXT; > } > } > Thanks, Miquèl