Re: [PATCH] jbd2: Fix a race between checkpointing code and journal_get_write_access()

Theodore Tso <tytso@xxxxxxx> · Wed, 8 Jul 2009 18:31:50 -0400

On Sun, Jul 05, 2009 at 10:53:19PM -0400, Theodore Tso wrote:
> On Wed, Jun 24, 2009 at 06:02:40PM +0200, Jan Kara wrote:
> > The following race can happen:
> > 
> >   CPU1                          CPU2
> >                                 checkpointing code checks the buffer, adds
> >                                   it to an array for writeback
> > do_get_write_access()
> >   ...
> >   lock_buffer()
> >   unlock_buffer()
> >                                   flush_batch() submits the buffer for IO
> >   __jbd2_journal_file_buffer()
> > 
> >   So a buffer under writeout is returned from do_get_write_access(). Since
> > the filesystem code relies on the fact that journaled buffers cannot be
> > written out, it does not take the buffer lock and so it can modify the
> > buffer while it is under writeout. That can lead to filesystem corruption
> > if we crash at the right moment.
> >   We fix the problem by clearing the buffer dirty bit under buffer_lock
> > even if the buffer is on the BJ_None list. Actually, we clear the dirty bit
> > regardless of which list the buffer is on, and warn if the buffer is
> > already journalled.
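
The fix described in the quoted text can be modelled in userspace roughly
as follows. This is only a sketch under stated assumptions: struct buf,
get_write_access() and flush_one() are hypothetical stand-ins rather than
the jbd2 code, and a pthread mutex plays the role of the buffer lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a buffer_head; not the kernel structure. */
struct buf {
	pthread_mutex_t lock;	/* models lock_buffer()/unlock_buffer() */
	bool dirty;		/* models the buffer dirty bit */
	bool journaled;		/* models "already filed on a journal list" */
	int writes_submitted;	/* counts simulated IO submissions */
};

/*
 * Models the patched write-access path: the dirty bit is cleared while
 * the lock is held, whatever list the buffer is on, and a warning is
 * printed if the buffer was already journaled. Returns whether the
 * buffer was dirty on entry.
 */
static bool get_write_access(struct buf *b)
{
	bool was_dirty;

	pthread_mutex_lock(&b->lock);
	was_dirty = b->dirty;
	b->dirty = false;
	if (was_dirty && b->journaled)
		fprintf(stderr, "warning: dirty buffer already journaled\n");
	b->journaled = true;
	pthread_mutex_unlock(&b->lock);
	return was_dirty;
}

/*
 * Models the writeback side: the buffer is submitted only if it is
 * still dirty when the lock is taken. Returns whether IO was submitted.
 */
static bool flush_one(struct buf *b)
{
	bool submitted = false;

	pthread_mutex_lock(&b->lock);
	if (b->dirty) {
		b->dirty = false;
		b->writes_submitted++;
		submitted = true;
	}
	pthread_mutex_unlock(&b->lock);
	return submitted;
}
```

In this model, once get_write_access() has run, a later flush_one() finds
the dirty bit clear and skips the buffer, so the writeout in the diagram
above is never started on a journaled buffer.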

When running fsstress, we get the "Spotted dirty metadata buffer;
there's a risk of filesystem corruption in case of a system crash"
warning at least half a dozen times.  That suggests we have a real
problem.  Did you expect this to be a "this should never happen"
situation, or is there a known bug that we need to fix here?

					- Ted