Re: [PATCH 06/11] xfs_repair: check v5 filesystem attr block header sanity

"Darrick J. Wong" <darrick.wong@xxxxxxxxxx> · Tue, 25 Aug 2015 21:50:43 -0700

On Wed, Aug 26, 2015 at 11:15:20AM +1000, Dave Chinner wrote:
> On Tue, Aug 25, 2015 at 05:59:11PM -0700, Darrick J. Wong wrote:
> > On Wed, Aug 26, 2015 at 10:45:02AM +1000, Dave Chinner wrote:
> > > On Tue, Aug 25, 2015 at 05:32:59PM -0700, Darrick J. Wong wrote:
> > > > Check the v5 fields (uuid, blocknr, owner) of attribute blocks for
> > > > obvious errors while scanning xattr blocks.  If the ownership info
> > > > is incorrect, kill the block.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > > 
> > > Why hasn't the buffer verifier done this validation?
> > 
> > Maybe I'm confused here, so here's what I think is going on:
> > 
> > AFAICT most of the verifiers do things like this:
> > 
> > if (crcs_enabled && cksum_verification fails) {
> > 	xfs_buf_ioerror(bp, -EFSBADCRC);
> > } else if (header_is_insane) {
> > 	xfs_buf_ioerror(bp, -EFSCORRUPTED);
> > }
> > 
> > The fuzzer corrupts the UUID without updating the CRC.  The verifier first
> > checks the CRC and it doesn't match, so it sets b_error to -EFSBADCRC and
> > doesn't get to the header check. 
> 
> Ok, that explains it - I didn't consider that case. This would seem
> like a general problem for repair when CRC errors are detected? i.e.
> we set the repair flag without doing the remaining verifier validity
> checks?

Heh, this seems like a moderately large refactoring project... :)

AFAICT, repair is at least checking some of that stuff for bmap,
directory, and symlink blocks, but as you point out it's scattered all
over the place.

It looks like the verifiers can easily check the magic, uuid, and
block fields; but how would we get the owner info to the verifier?

What if we add a "u64 owner" to xfs_buf and change all functions that
grab a buffer to take a u64 owner argument?  That'd be rather a lot of
things to change between the kernel and xfsprogs, but otoh we'd gain
owner checking and centralize verification of the v5 fields.  I think
we'd have to change xfs_trans_{get,read}_buf*() in the kernel and
libxfs_{get,read}buf*() in xfsprogs.

Then we'd rework the verifiers as such:

if (header_is_insane)
	xfs_buf_ioerror(bp, -EFSCORRUPTED);
else if (crcs and crc_fails)
	xfs_buf_ioerror(bp, -EFSBADCRC);

Since the header being garbage seems like a much more severe error
than only the checksum being wrong.  Then we'd always know when the
v5 fields don't match our expectations.

Next we'd change repair to do something like this:

bp = libxfs_readbuf(...);
if (!bp) {
	/* couldn't get a buffer, abort */
	return;
} else if (bp->b_error == -EFSBADCRC) {
	/* just a bad crc; see if the rest of the block is ok */
	repair++;
} else if (bp->b_error) {
	/* io error or corrupt header; toss out the owner */
	toss_owner();
	libxfs_putbuf(bp);
	return;
}

/* more checks... */

if (repair)
	libxfs_writebuf(bp);

How's that sound?

--D

> 
> As it is, I don't really like duplicating the verifier checks in
> repair. ISTR I recently suggested that we need to factor all the
> common verifier checks (magic, owner, uuid, blockno) into a single
> function that all verifiers called to remove all the code
> duplication. If we do this, then repair can also call the function
> to verify headers after a CRC failure to determine if repair is
> possible....
> 
> This is a bit more work, so I'll probably take this specific patch
> for 4.2.0, but I'd like to see this all factored out so we aren't
> duplicating code unnecessarily.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@xxxxxxxxxxxxx
> 
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs