On 05/26/2012 12:24 AM, Sage Weil wrote: > Hi Yan, > > On Fri, 25 May 2012, Yan, Zheng wrote: >> I got lots of NULL pointer dereference Oops when compiling kernel on ceph. >> The bug is because the kernel page migration routine replaces some pages >> in the page cache with new pages, these new pages' private can be non-zero. >> >> Signed-off-by: Zheng Yan <zheng.z.yan@xxxxxxxxx> >> --- >> fs/ceph/addr.c | 5 ++++- >> 1 files changed, 4 insertions(+), 1 deletions(-) >> >> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c >> index 173b1d2..1f180b1 100644 >> --- a/fs/ceph/addr.c >> +++ b/fs/ceph/addr.c >> @@ -984,7 +984,10 @@ retry_locked: >> BUG_ON(!ci->i_snap_realm); >> down_read(&mdsc->snap_rwsem); >> BUG_ON(!ci->i_snap_realm->cached_context); >> - snapc = (void *)page->private; >> + if (PagePrivate(page)) >> + snapc = (void *)page->private; >> + else >> + snapc = NULL; >> if (snapc && snapc != ci->i_head_snapc) { >> /* >> * this page is already dirty in another (older) snap >> -- > > This looks right for this case. What I'm less sure about is whether it's > sufficient. I take it the migration only happens on clean pages? (Unless > ->migrate is implemented?) Yes, migration only happens on pages that are clean and have no private stuff. > > Before we were assuming private == NULL on any clean page, independent of > PG_private. For example, ceph_invalidatepage() and ceph_releasepage() > have BUG_ONs that are probably wrong. writepage_nounlock() has a sanity > check that should probably be strengthened/guarded with a PagePrivate() > check, too. > Will send the v2 patch soon. Regards Yan, Zheng -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html