Re: [patch][rfc] fs: fix nobh error handling

Nick Piggin <npiggin@xxxxxxx> · Wed, 8 Aug 2007 05:12:35 +0200

On Tue, Aug 07, 2007 at 07:33:47PM -0700, Andrew Morton wrote:
> On Wed, 8 Aug 2007 04:18:38 +0200 Nick Piggin <npiggin@xxxxxxx> wrote:
> 
> > On Tue, Aug 07, 2007 at 06:09:03PM -0700, Andrew Morton wrote:
> > > 
> > > With this change, nobh_prepare_write() can magically attach buffers to the
> > > page.  But a filesystem which is running in nobh mode wouldn't expect that,
> > > and could quite legitimately go BUG, or leak data or, more seriously and
> > > much less fixably, just go and overwrite page->private, because it "knows"
> > > that nobody else is using ->private.
> > 
> > > I was fairly sure that a filesystem cannot assume buffers won't be
> > > attached, because there are various error case paths that do exactly
> > the same thing (eg. nobh_writepage can call __block_write_full_page
> > which will attach buffers). 
> 
> oh crap, that's sad.  Either we broke it later on or I misremembered.
> 
> > Does any filesystem assume this? Is it not already broken?
> 
> Yes, it would be broken.
> 
> > 
> > > I'd have thought that it would be better to not attach the buffers and to
> > > go ahead and do whatever synchronous IO is needed in the error recovery
> > > code, then free those buffers again.
> > 
> > It is hard because if the synchronous IO fails, then what do you do?
> 
> Do what we usually do when an IO error happens: crash the kernel?  (Sorry,
> have been spending too long at bugzilla.kernel.org)

Heh.. I guess there is still a chance to retry the IO with sync or
fsync. I'm not surprised if the "normal" pagecache error handling
paths don't work so well either, but at least if we duplicate as
little code as possible, it might get fixed up one day.

> > You could try making it up as you go along, but of course if we _can_
> > attach the buffers here then it would be preferable to do that. IMO.
> >  
> > 
> > > Also, you have a couple of (cheerily uncommented) PagePrivate() tests in
> > > there which should be page_has_buffers().
> > 
> > Yeah, I guess the whole thing needs more commenting :P
> > page_has_buffers... right, I'll change that.
> 
> Did it get much testing?

A little. Obviously it only really changes anything when an IO error hits,
and I found that ext3/jbd gives up and goes readonly pretty quickly when I
inject IO errors into the block device. What I really want to do is just
inject faults at nobh_prepare_write and do some longer tests.

I'll do that today. 