Re: [PATCH 1/7] xfs: add a function to deal with corrupt buffers post-verifiers

"Darrick J. Wong" <darrick.wong@xxxxxxxxxx> · Wed, 11 Mar 2020 09:42:09 -0700

On Wed, Mar 11, 2020 at 04:35:59PM +1100, Dave Chinner wrote:
> On Tue, Mar 10, 2020 at 05:47:19PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > 
> > Add a helper function to get rid of buffers that we have decided are
> > corrupt after the verifiers have run.  This function is intended to
> > handle metadata checks that can't happen in the verifiers, such as
> > inter-block relationship checking.  Note that we now mark the buffer
> > stale so that it will not end up on any LRU and will be purged on
> > release.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > ---
> ....
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index 11a97bc35f70..9d26e37f78f5 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -16,6 +16,8 @@
> >  #include "xfs_log.h"
> >  #include "xfs_errortag.h"
> >  #include "xfs_error.h"
> > +#include "xfs_trans.h"
> > +#include "xfs_buf_item.h"
> >  
> >  static kmem_zone_t *xfs_buf_zone;
> >  
> > @@ -1572,6 +1574,29 @@ xfs_buf_zero(
> >  	}
> >  }
> >  
> > +/*
> > + * Log a message about and stale a buffer that a caller has decided is corrupt.
> > + *
> > + * This function should be called for the kinds of metadata corruption that
> > + * cannot be detect from a verifier, such as incorrect inter-block relationship
> > + * data.  Do /not/ call this function from a verifier function.
> 
> So if it's called from a read verifier, the buffer will not have the
> XBF_DONE flag set on it. Maybe we should assert that this flag is
> set on the buffer? Yes, I know XBF_DONE will be set at write time,
> but most verifiers are called from both the read and write path so
> it should catch invalid use at read time...

<nod>

> 
> > + * The buffer must not be dirty prior to the call.  Afterwards, the buffer will
> 
> Why can't it be dirty?

There isn't a technical reason why this function cannot handle a dirty
buffer, but it seems like a mistake in data handling to dirty a buffer
and /then/ decide it's garbage.  That's what the lengthy assertion below
is about...

> > + * be marked stale, but b_error will not be set.  The caller is responsible for
> > + * releasing the buffer or fixing it.
> > + */
> > +void
> > +__xfs_buf_mark_corrupt(
> > +	struct xfs_buf		*bp,
> > +	xfs_failaddr_t		fa)
> > +{
> > +	ASSERT(bp->b_log_item == NULL ||
> > +	       !(bp->b_log_item->bli_flags & XFS_BLI_DIRTY));
> 
> XFS_BLI_DIRTY isn't a complete definition of a dirty buffer. What it
> means is "modifications to this buffer are not yet
> committed to the journal". It may have been linked into a
> transaction but not modified, but it can still be XFS_BLI_DIRTY
> because it is in the CIL. IOWs, transactions can cancel safely
> aborted even when the items in it are dirty as long as the
> transaction itself is clean and made no modifications to the object.
> 
> Hence I'm not sure what you are trying to protect against here?

...ah, ok, I'll just get rid of the assertion entirely then.

--D

> 
> The rest of the code looks fine.
> 
> Cheers,
> 
> Dave,
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx