Hi Mark and Chandra,

On Fri, Apr 26, 2013 at 10:32:34AM -0500, Mark Tinguely wrote:
> On 04/25/13 17:41, Chandra Seetharaman wrote:
> >In which case something along the lines of
> >
> >---
> >diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> >index 3806088..3fb2fa6 100644
> >--- a/fs/xfs/xfs_mount.c
> >+++ b/fs/xfs/xfs_mount.c
> >@@ -203,7 +203,13 @@ xfs_perag_get(struct xfs_mount *mp, xfs_agnumber_t agno)
> >         if (pag) {
> >                 ASSERT(atomic_read(&pag->pag_ref) >= 0);
> >                 ref = atomic_inc_return(&pag->pag_ref);
> >-        }
> >+        } else
> >+                /*
> >+                 * xfs_perag_get() is called with invalid agno,
> >+                 * which cannot happen. This indicates a problem
> >+                 * in the calling code.
> >+                 */
> >+                BUG();
> >         rcu_read_unlock();
> >         trace_xfs_perag_get(mp, agno, ref, _RET_IP_);
> >         return pag;
> >--------
> >
> >would be useful? Since we have a NULL pag, we will trip somewhere
> >else. At least with this, there is a pointer to the debugger/sysadmin
> >about where/what to look for (maybe with a more valuable/correct comment
> >than above).
> >
>
> We will have to make sure the callers of xfs_perag_get() handle the NULL
> before dereferencing it. Sometimes the NULL is normal and just means the
> perag structure has not been initialized yet.
>
> Properly handling the NULL from xfs_perag_get() in the caller will also
> mean that the callers of the callers of xfs_perag_get() have to handle
> the NULL returned to them. I will come back to this once the CRC stuff
> has been put to rest.

I agree that we want to address this. Our worst case should be a forced
shutdown, rather than a NULL ptr deref or a BUG(). Ideally one
corrupted filesystem does not result in a full system outage, right? ;)

There are others like this, e.g. xfs_da_read_buf can return 0 with a
NULL buffer pointer, and we rarely check for that before using bp.

-Ben

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
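
To make the "forced shutdown rather than NULL deref or BUG()" idea concrete, here is a rough userspace sketch of the caller-side pattern. It is not kernel code and not a proposed patch: every model_* name, the MODEL_ERR_CORRUPT value, and the fixed-size AG table are made up for illustration. It also assumes the caller is one for which a NULL perag really is unexpected; as Mark notes, some callers see NULL legitimately and would need different handling.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_AG_COUNT    4    /* hypothetical number of allocation groups */
#define MODEL_ERR_CORRUPT 117  /* arbitrary "corruption" errno for this model */

struct model_perag {
	int  ref;          /* reference count, standing in for pag_ref */
	long free_blocks;  /* some per-AG state a caller wants to read */
};

struct model_mount {
	struct model_perag *perag[MODEL_AG_COUNT]; /* NULL: not initialized */
	bool shut_down;                            /* set once by a forced shutdown */
};

/* Look up a perag and take a reference; NULL if agno is invalid or unset. */
static struct model_perag *model_perag_get(struct model_mount *mp, unsigned agno)
{
	struct model_perag *pag = NULL;

	if (agno < MODEL_AG_COUNT)
		pag = mp->perag[agno];
	if (pag) {
		assert(pag->ref >= 0);
		pag->ref++;
	}
	return pag;
}

/* Drop a reference taken by model_perag_get(). */
static void model_perag_put(struct model_perag *pag)
{
	assert(pag->ref > 0);
	pag->ref--;
}

/* Take only this filesystem offline; the rest of the system keeps running. */
static void model_force_shutdown(struct model_mount *mp, const char *why)
{
	if (!mp->shut_down)
		fprintf(stderr, "model fs: forced shutdown: %s\n", why);
	mp->shut_down = true;
}

/*
 * A caller that checks for NULL before dereferencing. An unexpected NULL
 * shuts down the filesystem and is reported to our own caller as an error,
 * which that caller must propagate in turn.
 */
static int model_ag_free_blocks(struct model_mount *mp, unsigned agno, long *out)
{
	struct model_perag *pag;

	if (mp->shut_down)
		return -EIO;

	pag = model_perag_get(mp, agno);
	if (!pag) {
		model_force_shutdown(mp, "no perag for requested AG");
		return -MODEL_ERR_CORRUPT;
	}

	*out = pag->free_blocks;
	model_perag_put(pag);
	return 0;
}
```

The point of the sketch is only the shape: the lookup stays free to return NULL, the caller turns an unexpected NULL into an error plus a shutdown flag, and later operations on that mount fail with -EIO instead of taking the machine down.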