In which case something along the lines of --- diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 3806088..3fb2fa6 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -203,7 +203,13 @@ xfs_perag_get(struct xfs_mount *mp, xfs_agnumber_t agno) if (pag) { ASSERT(atomic_read(&pag->pag_ref) >= 0); ref = atomic_inc_return(&pag->pag_ref); - } + } else + /* + * xfs_perag_get() is called with invalid agno, + * which cannot happen. This indicates a problem + * in the calling code. + */ + BUG(); rcu_read_unlock(); trace_xfs_perag_get(mp, agno, ref, _RET_IP_); return pag; -------- would be useful ?. Since we have a NULL pag, we will trip somewhere else. At least with this, there is a pointer to the debugger/sysadmin about where/what to look for (may be with more valuable/correct comment than above). On Wed, 2013-04-24 at 06:49 +1000, Dave Chinner wrote: > On Tue, Apr 23, 2013 at 10:54:35AM -0500, Chandra Seetharaman wrote: > > On Tue, 2013-04-23 at 08:48 -0500, Mark Tinguely wrote: > > > On 04/22/13 18:30, Dave Chinner wrote: > > > > On Mon, Apr 22, 2013 at 10:11:39AM -0500, Mark Tinguely wrote: > > > >> #6 [ffff880135603980] _xfs_buf_find at ffffffffa01a7fef [xfs] > > > >> #7 [ffff8801356039f0] xfs_buf_get at ffffffffa01a824a [xfs] > > > >> #8 [ffff880135603a30] xfs_buf_read at ffffffffa01a83a4 [xfs] > > > >> #9 [ffff880135603a60] xlog_recover_inode_pass2 at ffffffffa0193629 [xfs] > > > > > > > > So it's the same problem as this bug fix addresses: > > > > > > > > commit 10616b806d1d7835b1d23b8d75ef638f92cb98b6 > > > > Author: Dave Chinner<dchinner@xxxxxxxxxx> > > > > Date: Mon Jan 21 23:53:52 2013 +1100 > > > > > > > > xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end > > > > > > > > When _xfs_buf_find is passed an out of range address, it will fail > > > > to find a relevant struct xfs_perag and oops with a null > > > > dereference. This can happen when trying to walk a filesystem with a > > > > metadata inode that has a partially corrupted extent map (i.e. the > > > > block number returned is corrupt, but is otherwise intact) and we > > > > try to read from the corrupted block address. > > > > > > > > In this case, just fail the lookup. If it is readahead being issued, > > > > it will simply not be done, but if it is real read that fails we > > > > will get an error being reported. Ideally this case should result > > > > in an EFSCORRUPTED error being reported, but we cannot return an > > > > error through xfs_buf_read() or xfs_buf_get() so this lookup failure > > > > may result in ENOMEM or EIO errors being reported instead. > > > > > > > > Signed-off-by: Dave Chinner<dchinner@xxxxxxxxxx> > > > > Reviewed-by: Brian Foster<bfoster@xxxxxxxxxx> > > > > Reviewed-by: Ben Myers<bpm@xxxxxxx> > > > > Signed-off-by: Ben Myers<bpm@xxxxxxx> > > > > > > > >> The recovery value is bad and is a problem on its own, but XFS does > > > >> not verify the validity of ag number when doing a xfs_perag_get(). > > I'd be interested to know why the inode in recovery is bad - is this > on a kernel that CRCs the log records? Or a result of some other bug > or hardware corruption? i.e. xfs_perag_get is not the problem here, > it's a corruption of a trusted inode number and we failed to detect > that corruption.... > > > > > Right, that's what the above fix does, but it can't be done on older > > > > kernels because grwofs relies on being able to get buffers beyond > > > > the existing filesystem limits... > > > > > > Thank-you, that make sense. > > > > > > I still do not like assuming xfs_perag_get() will always return a valid > > > perag pointer. > > > > I second that. > > > > Is there any reason we should _not_ check the return value from > > xfs_perag_get() for NULL ? > > Yes. The input AG should already be bounds checked before the perag > is looked up. If we are asking for an invalid AG, then the bug is not > in xfs_perag_get(), it is in the code that is calling it. i.e. error > checking the xfs_perag_get() function is a band-aid for improper > object validation, not a solution to the problem. > > That is, this function was designed to be extremely low overhead and > only to be handed validated data. If it is only handed validated > data, then it is guaranteed to return a valid per-ag structure, > and therefore error checking the return value is not necessary. > > Because xfs_perag_get is not designed to handle untrusted data it is > up to the calling code to first validate the AGNO that is passed to > xfs_perag_get(). If we aren't first validating the object that the > AGNO is derived from, then the calling code has failed to validate > it's inputs sufficiently, and lots of other things can go wrong (not > just the xfs_perag_get() call). > > For example, the above commit is a catchall for bad block numbers > being looked up in extent records. It was a quick fix, not a > targeted fix for the reported problems. For bad block numbers in > extents, we should be doing is validating block numbers when they > are looked up are within the filesystem bounds (eg. inside > xfs_bmapi_read/xfs_bmapi_write) so that a bad block number is caught > at lookup time, not at IO time. We only do that for extents that > point to block 0. > > Cheers, > > Dave. _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs