Re: error propagation problem on xfs over dm stripe

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 01 2017 at  2:42am -0500,
Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:

> On Tue, Jan 31, 2017 at 09:12:07PM -0600, Eric Sandeen wrote:
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index 3086da5..3555ba8 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -808,7 +808,9 @@ static void dec_pending(struct dm_io *io, int error)
> >  		} else {
> >  			/* done with normal IO or empty flush */
> >  			trace_block_bio_complete(md->queue, bio, io_error);
> > -			bio->bi_error = io_error;
> > +			/* don't overwrite or clear existing errors */
> > +			if (!bio->bi_error)
> > +				bio->bi_error = io_error;
> >  			bio_endio(bio);
> >  		}
> >  	}
> > 
> > but Mike was a little uneasy, not knowing for sure how we got here to
> > overwrite this bio's error (hopefully I'm representing his concerns
> > fairly and correctly).

Well that is just it, I'm not seeing how io_error (io->error) can ever
transition from non-zero to zero.  And bio->bi_error shouldn't be set
without having first set io->error.  But just cause I cannot see it
doesn't change the fact that it is clearly happening to you.

It does concern me that this kind of fundamental error propagation
change is needed.  Speaks to a regression.  Would be nice to bisect
this.. Eric? ;)
 
> FYI, what we do both in the XFS buffer cache and the new direct I/O
> code is to use a
> 
> 	        cmpxchg(&dio->error, 0, ret);
> 
> in a similar situation, which should work here, too.

What is the benefit?  Faster than the conditional?

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel



[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux