Re: [patch] fs: improved handling of page and buffer IO errors

Nick Piggin <npiggin@xxxxxxx> · Wed, 22 Oct 2008 12:31:13 +0200

On Tue, Oct 21, 2008 at 06:16:24PM +0200, Andi Kleen wrote:
> Nick Piggin <npiggin@xxxxxxx> writes:
> 
> > IO error handling in the core mm/fs still doesn't seem perfect, but with
> > the recent round of patches and this one, it should be getting on the
> > right track.
> >
> > I kind of get the feeling some people would rather forget about all this
> > and brush it under the carpet. Hopefully I'm mistaken, but if anybody
> > disagrees with my assertion that error handling, and data integrity
> > semantics are first-class correctness issues, and therefore are more
> > important than all other non-correctness problems... speak now and let's
> > discuss that, please.
> >
> > Otherwise, unless anybody sees obvious problems with this, hopefully it
> > can go into -mm for some wider testing (I've tested it with a few filesystems
> > so far and no immediate problems)
> 
> I think the first step to get these more robust in the future would be to
> have a standard regression test testing these paths.  Otherwise it'll
> bit-rot sooner or later again.

The problem I've had with testing is that it's hard to trigger a specific
path for a given error, because write IO especially can be quite
non-deterministic, and the filesystem or kernel may give up at various points.

I agree, but I just don't know exactly how they can be turned into
standard tests. Some filesystems like XFS seem to completely shut down
quite easily on IO errors. Others like ext2 can't really unwind from
a failure in a multi-block operation (e.g. allocating a block to an
inode) if an error is detected, so the error just gets ignored.

I am testing, but mainly just with random failure injection, and seeing
whether things hit a BUG or errors go undetected, etc.
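
To make the idea concrete, here is a minimal user-space sketch of random
failure injection. It is purely illustrative and is not kernel code or the
actual test setup: FlakyDevice, careful_writer, sloppy_writer and
undetected_errors are all made-up names. It assumes a fake in-memory device
that fails writes at random from a seeded generator, so a run that shows a
problem can be replayed, and it counts injected errors that the caller
never reported.

```python
import random


class InjectedIOError(OSError):
    """Error raised by the fake device when a write is chosen to fail."""


class FlakyDevice:
    """Hypothetical in-memory block device that fails writes at random."""

    def __init__(self, nblocks, fail_rate, seed):
        self.blocks = [None] * nblocks
        self.fail_rate = fail_rate
        # Seeded RNG: the same seed replays the same failure pattern.
        self.rng = random.Random(seed)
        self.injected = 0  # errors the device has injected so far

    def write(self, blkno, data):
        if self.rng.random() < self.fail_rate:
            self.injected += 1
            raise InjectedIOError("injected write error on block %d" % blkno)
        self.blocks[blkno] = data


def careful_writer(dev, nwrites):
    """Write nwrites blocks and return how many errors were reported."""
    reported = 0
    for i in range(nwrites):
        try:
            dev.write(i % len(dev.blocks), i)
        except InjectedIOError:
            reported += 1
    return reported


def sloppy_writer(dev, nwrites):
    """Like careful_writer, but swallows errors, so it always reports 0."""
    for i in range(nwrites):
        try:
            dev.write(i % len(dev.blocks), i)
        except InjectedIOError:
            pass  # error silently dropped
    return 0


def undetected_errors(writer, seed, nwrites=200, fail_rate=0.1):
    """Run one trial; return injected errors that the writer never reported."""
    dev = FlakyDevice(nblocks=16, fail_rate=fail_rate, seed=seed)
    reported = writer(dev, nwrites)
    return dev.injected - reported
```

Calling undetected_errors(careful_writer, seed=1) and
undetected_errors(sloppy_writer, seed=1) replays the same failure pattern
against both callers; a non-zero result means some injected error was never
reported. Fixing the seed only makes the injection repeatable in this toy
model. It does not address the real difficulty that kernel write IO takes
different paths from run to run.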
