On Mon, Jan 07, 2013 at 09:10:12AM -0600, Mark Tinguely wrote: > On 01/03/13 16:02, Dave Chinner wrote: > >On Thu, Jan 03, 2013 at 03:22:22PM -0600, Ben Myers wrote: > >>Dave, > >> > >>On Wed, Dec 19, 2012 at 09:43:45AM +1100, Dave Chinner wrote: > >>>From: Dave Chinner<dchinner@xxxxxxxxxx> > >>> > >>>When _xfs_buf_find is passed an out of range address, it will fail > >>>to find a relevant struct xfs_perag and oops with a null > >>>dereference. This can happen when trying to walk a filesystem with a > >>>metadata inode that has a partially corrupted extent map (i.e. the > >>>block number returned is corrupt, but is otherwise intact) and we > >>>try to read from the corrupted block address. > >>> > >>>In this case, just fail the lookup. If it is readahead being issued, > >>>it will simply not be done, but if it is real read that fails we > >>>will get an error being reported. Ideally this case should result > >>>in an EFSCORRUPTED error being reported, but we cannot return an > >>>error through xfs_buf_read() or xfs_buf_get() so this lookup failure > >>>may result in ENOMEM or EIO errors being reported instead. > >>> > >>>Signed-off-by: Dave Chinner<dchinner@xxxxxxxxxx> > >>>--- > >>> fs/xfs/xfs_buf.c | 18 ++++++++++++++++++ > >>> 1 file changed, 18 insertions(+) > >>> > >>>diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > >>>index a80195b..16249d9 100644 > >>>--- a/fs/xfs/xfs_buf.c > >>>+++ b/fs/xfs/xfs_buf.c > >>>@@ -487,6 +487,7 @@ _xfs_buf_find( > >>> struct rb_node *parent; > >>> xfs_buf_t *bp; > >>> xfs_daddr_t blkno = map[0].bm_bn; > >>>+ xfs_daddr_t eofs; > >>> int numblks = 0; > >>> int i; > >>> > >>>@@ -498,6 +499,23 @@ _xfs_buf_find( > >>> ASSERT(!(numbytes< (1<< btp->bt_sshift))); > >>> ASSERT(!(BBTOB(blkno)& (xfs_off_t)btp->bt_smask)); > >>> > >>>+ /* > >>>+ * Corrupted block numbers can get through to here, unfortunately, so we > >>>+ * have to check that the buffer falls within the filesystem bounds. > >>>+ */ > >>>+ eofs = XFS_FSB_TO_BB(btp->bt_mount, btp->bt_mount->m_sb.sb_dblocks); > >>>+ if (blkno>= eofs || blkno + numblks> eofs) { > >> ^^^^^^^^^^^^^^^^^^^^^^ > >> > >>That looks suspect to me. I think you need to go over each buffer > >>individually. > > > >I'm not trying to validate every single part of a buffer here - > >there is no need to do that as the block numbers are validated > >against device overruns during IO. i.e. we'll get an EIO and a log > >message telling us an attempt to access beyond the end of the device > >occurring during IO. > > > >I.e. we aren't doing validity checks on whether a buffer has a sane > >block number or not (that's up to the caller), what we are > >avoiding is attempting to look up a buffer that is outside of the > >range of the cache indexing. i.e. it's validating the cache index we > >are about to use, not passing judgement on whether the caller has > >asked for a valid set of blocks or not. > > I did not like the second part of the if statement because first > block number in a "discontiguous" buffer does not have to be the > lowest block number. Again, _xfs_buf_find() does not care about secondary blocks in discontiguous buffers. It cares about the parameters of the cached items - which are described by the {blkno, numblks} tuple - and doesn't care about sub-block ranges in discontiguous buffers at all. > The first half of the if statement alone would prevent the oops. Right, that fixes the initial cache index lookup problem - I wanted to validate the entire tuple. But you're right in that discontigous buffers could have blkno + numblks beyond EOFS so validating cache item ranges like this is wrong. I'll drop that check. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs