Re: [PATCH 1/4 V2] xfs: catch buffers written without verifiers attached

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 31 Jul 2014 07:39:20 +1000

On Wed, Jul 30, 2014 at 12:29:14PM -0400, Brian Foster wrote:
> On Wed, Jul 30, 2014 at 12:30:24PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > We recently had a bug where buffers were slipping through log
> > recovery without any verifier attached to them. This was resulting
> > in on-disk CRC mismatches for valid data. Add some warning code to
> > catch this occurrence so that we catch such bugs during development
> > rather than not being aware they exist.
> > 
> > Note that we cannot do this verification unconditionally as non-CRC
> > filesystems don't always attach verifiers to the buffers being
> > written. e.g. during log recovery we cannot identify all the
> > different types of buffers correctly on non-CRC filesystems, so we
> > can't attach the correct verifiers in all cases and so we don't
> > attach any. Hence we don't want to warn on non-CRC filesystems, to
> > avoid spamming the logs with false indications.
> > 
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > ---
> >  fs/xfs/xfs_buf.c | 15 +++++++++++++++
> >  fs/xfs/xfs_log.c |  7 ++++++-
> >  2 files changed, 21 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index a6dc83e..078b8be 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -1330,6 +1330,21 @@ _xfs_buf_ioapply(
> >  						   SHUTDOWN_CORRUPT_INCORE);
> >  				return;
> >  			}
> > +		} else if (bp->b_bn != -1LL) {
> > +			struct xfs_mount *mp = bp->b_target->bt_mount;
> > +
> > +			/*
> > +			 * non-crc filesystems don't attach verifiers during
> > +			 * log recovery, so don't warn for such filesystems.
> > +			 */
> > +			if (xfs_sb_version_hascrc(&mp->m_sb)) {
> > +				xfs_warn(mp,
> > +					"%s: no ops on block 0x%llx/0x%llx",
> > +					__func__, bp->b_bn,
> > +					bp->b_maps[0].bm_bn);
> 
> Are you intending to print both block number values here or the
> b_bn/bm_len combo?

Yeah, I probably did. I didn't actually look at the block numbers in
the output I kept getting - just the magic number in the hex dump
and the stack trace...
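
For anyone following along, the condition this hunk adds can be
modelled in userspace roughly as below. This is only a sketch: the
model_* names and the struct are made up for illustration and are not
the kernel definitions. has_ops stands in for bp->b_ops being set, and
fs_has_crc for the xfs_sb_version_hascrc() result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
typedef int64_t model_daddr_t;
#define MODEL_DADDR_NULL ((model_daddr_t)-1LL)

struct model_buf {
	model_daddr_t	bn;		/* models bp->b_bn */
	bool		has_ops;	/* models bp->b_ops != NULL */
};

/* Returns true when the write path in the patch would emit the warning. */
static bool
model_should_warn(const struct model_buf *bp, bool fs_has_crc)
{
	if (bp->has_ops)
		return false;	/* verifier attached: nothing to report */
	if (bp->bn == MODEL_DADDR_NULL)
		return false;	/* block number of -1: skipped by the patch */
	return fs_has_crc;	/* non-CRC filesystems may lack verifiers */
}

/* Small usage example covering the four interesting cases. */
static void
model_selfcheck(void)
{
	struct model_buf bp = { .bn = 0x80, .has_ops = false };

	assert(model_should_warn(&bp, true));	/* CRC fs, no ops: warn */
	assert(!model_should_warn(&bp, false));	/* non-CRC fs: stay quiet */

	bp.has_ops = true;
	assert(!model_should_warn(&bp, true));	/* verifier present */

	bp.has_ops = false;
	bp.bn = MODEL_DADDR_NULL;
	assert(!model_should_warn(&bp, true));	/* block number is -1 */
}
```

The ordering mirrors the patch: the ops check comes first, then the
block number check, and the CRC feature test only decides whether the
missing verifier is worth reporting.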

> 
> > +				xfs_hex_dump(bp->b_addr, 64);
> > +				dump_stack();
> > +			}
> >  		}
> >  	} else if (bp->b_flags & XBF_READ_AHEAD) {
> >  		rw = READA;
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index 149a4a5..9dc92b3 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -1378,8 +1378,13 @@ xlog_alloc_log(
> >  
> >  	xlog_get_iclog_buffer_size(mp, log);
> >  
> > +	/*
> > +	 * Use a block number of -1 for the extra log buffer used during splits
> > +	 * so that it will trigger errors if we ever try to do IO on it without
> > +	 * first having set it up properly.
> > +	 */
> >  	error = -ENOMEM;
> > -	bp = xfs_buf_alloc(mp->m_logdev_targp, 0, BTOBB(log->l_iclog_size), 0);
> > +	bp = xfs_buf_alloc(mp->m_logdev_targp, -1LL, BTOBB(log->l_iclog_size), 0);
> 
> How about XFS_BUF_DADDR_NULL (here and above)?

Will fix.
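
To illustrate why a named null block number is preferable to 0 for
the extra log buffer, here is a minimal userspace sketch. All of the
sketch_* names are hypothetical, and the -EIO rejection is an
assumption made for the model; it is not a claim about where the real
IO path reports the error.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins; the kernel side would use XFS_BUF_DADDR_NULL. */
typedef int64_t sketch_daddr_t;
#define SKETCH_DADDR_NULL ((sketch_daddr_t)-1LL)

struct sketch_buf {
	sketch_daddr_t	bn;	/* disk address the buffer maps to */
	unsigned int	len_bb;	/* length in basic blocks */
};

/* Allocate a buffer description with the given address and length. */
static struct sketch_buf
sketch_buf_alloc(sketch_daddr_t bn, unsigned int len_bb)
{
	struct sketch_buf bp = { .bn = bn, .len_bb = len_bb };

	return bp;
}

/* Model of IO submission: refuse buffers that were never given an address. */
static int
sketch_buf_submit(const struct sketch_buf *bp)
{
	if (bp->bn == SKETCH_DADDR_NULL)
		return -EIO;	/* assumption: unprepared buffer is an error */
	return 0;
}

/* Usage example: a placeholder buffer fails until it is set up. */
static void
sketch_demo(void)
{
	struct sketch_buf bp = sketch_buf_alloc(SKETCH_DADDR_NULL, 64);

	assert(sketch_buf_submit(&bp) == -EIO);

	bp.bn = 128;	/* caller sets the real address before doing IO */
	assert(sketch_buf_submit(&bp) == 0);
}
```

Had the placeholder been allocated at block 0, the model could not
tell an unprepared buffer from one that really targets block 0, which
is the situation the comment in the hunk is trying to avoid.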

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
