Re: [PATCH v1 11/11] mtd: new support oops logger based on pstore/blk

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi liao,

liaoweixiong <liaoweixiong@xxxxxxxxxxxxxxxxx> wrote on Thu, 6 Feb 2020
21:10:47 +0800:

> hi Miquel Raynal,
> 
> On 2020/1/23 AM 1:41, Miquel Raynal wrote:
> > Hello,
> > 
> >   
> >>>>>> +/*
> >>>>>> + * All zones will be read as pstore/blk will read zone one by one when do
> >>>>>> + * recover.
> >>>>>> + */
> >>>>>> +static ssize_t mtdpstore_read(char *buf, size_t size, loff_t off)
> >>>>>> +{
> >>>>>> +	struct mtdpstore_context *cxt = &oops_cxt;
> >>>>>> +	size_t retlen;
> >>>>>> +	int ret;
> >>>>>> +
> >>>>>> +	if (mtdpstore_block_isbad(cxt, off))
> >>>>>> +		return -ENEXT;
> >>>>>> +
> >>>>>> +	pr_debug("try to read off 0x%llx size %zu\n", off, size);
> >>>>>> +	ret = mtd_read(cxt->mtd, off, size, &retlen, (u_char *)buf);
> >>>>>> +	if ((ret < 0 && !mtd_is_bitflip(ret)) || size != retlen)  {  
> >>>>>
> >>>>> IIRC size != retlen does not mean it failed, but that you should
> >>>>> continue reading after retlen bytes, no?  
> >>>>>     >>  
> >>>> Yes, you are right. I will fix it. Thanks.  
> >>>>   >>>>> Also, mtd_is_bitflip() does not mean that you are reading a false  
> >>>>> buffer, but that the data has been corrected as it contained bitflips.
> >>>>> mtd_is_eccerr() however, would be meaningful.  
> >>>>>     >>  
> >>>> Sure I know mtd_is_bitflip() does not mean failure, but I do not think
> >>>> mtd_is_eccerr() should be here since the codes are ret < 0 and NOT
> >>>> mtd_is_bitflip().  
> >>>
> >>> Yes, just drop this check, only keep ret < 0.  
> >>>    >>  
> >> If I don't get it wrong, it should not	 be dropped here. Like your words,
> >> "mtd_is_bitflip() does not mean that you are reading a false buffer,
> >> but that the data has been corrected as it contained bitflips.", the
> >> data I get are valid even if mtd_is_bitflip() return true. It's correct
> >> data and it's no need to go to handle error. To me, the codes
> >> should be:
> >> 	if (ret < 0 && !mit_is_bitflip())
> >> 		[error handling]  
> > 
> > Please check the implementation of mtd_is_bitflip(). You'll probably
> > figure out what I am saying.
> > 
> > https://elixir.bootlin.com/linux/latest/source/include/linux/mtd/mtd.h#L585
> >   
> 
> How about the codes as follows:
> 
> for (done = 0, retlen = 0; done < size; done += retlen) {
> 	ret = mtd_read(..., &retlen, ...);
> 	if (!ret)
> 		continue;
> 	/*
> 	 * do nothing if bitflip and ecc error occurs because whether
> 	 * it's bitflip or ECC error, just a small number of bits flip
> 	 * and the impact on log data is so small. The mtdpstore just
> 	 * hands over what it gets and user can judge whether the data
> 	 * is valid or not.
> 	 */
> 	if (mtd_is_bitflip(ret)) {
> 		dev_warn("bitflip at....");
> 		continue;

I don't understand why do you check for bitflips. Bitflips have been
corrected at this stage, you just get the information that there
has been bitflips, but the data integrity is fine.

I am not against ignoring ECC errors in this case though. I would
propose:

	for (...) {
		if (ret < 0) {
			complain;
			return;
		}

		if (mtd_is_eccerr())
			complain;
	}
		
> 	} else if (mtd_is_eccerr(ret)) {
> 		dev_warn("eccerr at....");
> 		retlen = retlen == 0 ? size : retlen;
> 		continue;
> 	} else {
> 		dev_err("read failure at...");
> 		/* this zone is broken, try next one */
> 		return -ENEXT;
> 	}
> }
> 


Thanks,
Miquèl




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux