Re: [PATCH] xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end

Dave Chinner <david@xxxxxxxxxxxxx> · Tue, 8 Jan 2013 09:41:54 +1100

On Mon, Jan 07, 2013 at 09:10:12AM -0600, Mark Tinguely wrote:
> On 01/03/13 16:02, Dave Chinner wrote:
> >On Thu, Jan 03, 2013 at 03:22:22PM -0600, Ben Myers wrote:
> >>Dave,
> >>
> >>On Wed, Dec 19, 2012 at 09:43:45AM +1100, Dave Chinner wrote:
> >>>From: Dave Chinner <dchinner@xxxxxxxxxx>
> >>>
> >>>When _xfs_buf_find is passed an out of range address, it will fail
> >>>to find a relevant struct xfs_perag and oops with a null
> >>>dereference. This can happen when trying to walk a filesystem with a
> >>>metadata inode that has a partially corrupted extent map (i.e. the
> >>>block number returned is corrupt, but is otherwise intact) and we
> >>>try to read from the corrupted block address.
> >>>
> >>>In this case, just fail the lookup. If it is readahead being issued,
> >>>it will simply not be done, but if it is a real read that fails we
> >>>will get an error being reported.  Ideally this case should result
> >>>in an EFSCORRUPTED error being reported, but we cannot return an
> >>>error through xfs_buf_read() or xfs_buf_get() so this lookup failure
> >>>may result in ENOMEM or EIO errors being reported instead.
> >>>
> >>>Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> >>>---
> >>>  fs/xfs/xfs_buf.c |   18 ++++++++++++++++++
> >>>  1 file changed, 18 insertions(+)
> >>>
> >>>diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> >>>index a80195b..16249d9 100644
> >>>--- a/fs/xfs/xfs_buf.c
> >>>+++ b/fs/xfs/xfs_buf.c
> >>>@@ -487,6 +487,7 @@ _xfs_buf_find(
> >>>  	struct rb_node		*parent;
> >>>  	xfs_buf_t		*bp;
> >>>  	xfs_daddr_t		blkno = map[0].bm_bn;
> >>>+	xfs_daddr_t		eofs;
> >>>  	int			numblks = 0;
> >>>  	int			i;
> >>>
> >>>@@ -498,6 +499,23 @@ _xfs_buf_find(
> >>>  	ASSERT(!(numbytes < (1 << btp->bt_sshift)));
> >>>  	ASSERT(!(BBTOB(blkno) & (xfs_off_t)btp->bt_smask));
> >>>
> >>>+	/*
> >>>+	 * Corrupted block numbers can get through to here, unfortunately, so we
> >>>+	 * have to check that the buffer falls within the filesystem bounds.
> >>>+	 */
> >>>+	eofs = XFS_FSB_TO_BB(btp->bt_mount, btp->bt_mount->m_sb.sb_dblocks);
> >>>+	if (blkno >= eofs || blkno + numblks > eofs) {
> >>			^^^^^^^^^^^^^^^^^^^^^^
> >>
> >>That looks suspect to me.  I think you need to go over each buffer
> >>individually.
> >
> >I'm not trying to validate every single part of a buffer here -
> >there is no need to do that as the block numbers are validated
> >against device overruns during IO. i.e. we'll get an EIO and a log
> >message telling us that an attempt to access beyond the end of the
> >device occurred during IO.
> >
> >I.e. we aren't doing validity checks on whether a buffer has a sane
> >block number or not (that's up to the caller), what we are
> >avoiding is attempting to look up a buffer that is outside of the
> >range of the cache indexing. i.e. it's validating the cache index we
> >are about to use, not passing judgement on whether the caller has
> >asked for a valid set of blocks or not.
> 
> I did not like the second part of the if statement because the first
> block number in a "discontiguous" buffer does not have to be the
> lowest block number.

Again, _xfs_buf_find() does not care about secondary blocks in
discontiguous buffers. It cares about the parameters of the cached
items - which are described by the {blkno, numblks} tuple - and
doesn't care about sub-block ranges in discontiguous buffers at all.

> The first half of the if statement alone would prevent the oops.

Right, that fixes the initial cache index lookup problem - I wanted
to validate the entire tuple. But you're right in that discontiguous
buffers could have blkno + numblks beyond EOFS so validating cache
item ranges like this is wrong. I'll drop that check.
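
To make the false positive concrete, here is a minimal userspace
sketch, not the kernel code. It assumes, as in _xfs_buf_find(), that
blkno comes from the first map entry and that numblks is the sum of
all segment lengths. struct buf_map, buf_key() and the two
out_of_range_*() helpers are hypothetical names used only for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one segment of a (possibly discontiguous) buffer. */
struct buf_map {
	int64_t	bn;	/* start address, in basic blocks */
	int	len;	/* length, in basic blocks */
};

/*
 * Build the cache key described in the thread: the address of the
 * first segment plus the total length of all segments.
 */
static void buf_key(const struct buf_map *map, size_t nmaps,
		    int64_t *blkno, int *numblks)
{
	size_t	i;

	*blkno = map[0].bn;
	*numblks = 0;
	for (i = 0; i < nmaps; i++)
		*numblks += map[i].len;
}

/* The check as posted: validates the whole {blkno, numblks} tuple. */
static bool out_of_range_tuple(int64_t blkno, int numblks, int64_t eofs)
{
	return blkno >= eofs || blkno + numblks > eofs;
}

/* The reduced check: validates only the address used as the cache index. */
static bool out_of_range_index(int64_t blkno, int64_t eofs)
{
	return blkno >= eofs;
}
```

With eofs = 1000 and segments {990, 8} and {100, 8}, both segments lie
inside the filesystem, yet the tuple check rejects the buffer because
990 + 16 > 1000. The index-only check accepts it and still rejects a
blkno of 1000 or more.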

Cheers,

Dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx
