On Tue, 2012-05-22 at 14:07 -0400, Andy Adamson wrote: > On Tue, May 22, 2012 at 1:46 PM, Myklebust, Trond > <Trond.Myklebust@xxxxxxxxxx> wrote: > > On Tue, 2012-05-22 at 16:30 +0000, Adamson, Andy wrote: > >> On May 22, 2012, at 12:24 PM, Myklebust, Trond wrote: > >> > >> > On Tue, 2012-05-22 at 08:09 -0400, andros@xxxxxxxxxx wrote: > >> >> From: Andy Adamson <andros@xxxxxxxxxx> > >> >> > >> >> Signed-off-by: Andy Adamson <andros@xxxxxxxxxx> > >> >> --- > >> >> fs/nfs/write.c | 2 ++ > >> >> 1 files changed, 2 insertions(+), 0 deletions(-) > >> >> > >> >> diff --git a/fs/nfs/write.c b/fs/nfs/write.c > >> >> index e6fe3d6..c7295de 100644 > >> >> --- a/fs/nfs/write.c > >> >> +++ b/fs/nfs/write.c > >> >> @@ -1555,6 +1555,8 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data) > >> >> /* We have a mismatch. Write the page again */ > >> >> dprintk(" mismatch\n"); > >> >> nfs_mark_request_dirty(req); > >> >> + nfs_unlock_request(req); > >> >> + continue; > >> >> next: > >> >> nfs_unlock_and_release_request(req); > >> >> } > >> > > >> > What is this patch trying to fix? As far as I can see it will lead to a > >> > reference leak. > >> > >> The release of the page drops the refcount to zero. Since it is a mismatch, we want to continue to use the page, so nfs_page_find_request_locked is called we get the WARNING in kref_get, and an Oops in various places in pnfs_update_layout. > >> > >> Here is the output of added printk's. > >> > >> -->Andy > >> > >> May 21 18:25:33 fedora-64-2 kernel: [ 152.988921] NFS: commit (0:35/2 4096@184320) > >> May 21 18:25:33 fedora-64-2 kernel: [ 152.988923] mismatch req ffff880042dbb800 > >> May 21 18:25:33 fedora-64-2 kernel: [ 152.988927] nfs_release_request WB req ffff880042dbb800 wb_kref 1 > >> May 21 18:25:33 fedora-64-2 kernel: [ 152.988932] put_lseg: lseg ffff880042d95380 ref 4 valid 0 > >> May 21 18:25:33 fedora-64-2 kernel: [ 152.988945] nfs_page_find_request_locked req ffff88003e033800 > >> May 21 18:25:33 fedora-64-2 kernel: [ 152.988949] nfs_page_find_request_locked WB req ffff88003e033800 wb_kref 0 > >> May 21 18:25:33 fedora-64-2 kernel: [ 152.988953] ------------[ cut here ]------------ > >> May 21 18:25:33 fedora-64-2 kernel: [ 152.989082] WARNING: at include/linux/kref.h:41 kref_get+0x20/0x2c [nfs]() > > > > Are these commit-to-ds requests? > > Yes OK, so how about the following fix: 8<--------------------------------------------------------------- >From 53b8ee346463946f88b3e1639d688c384df1166c Mon Sep 17 00:00:00 2001 From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> Date: Tue, 22 May 2012 16:36:27 -0400 Subject: [PATCH] NFSv4.1: Fix a bad reference count issue in the pNFS commit code filelayout_scan_commit_lists needs to bump the reference count on the struct nfs_page just like nfs_scan_commit_list(). Reported-by: Andy Adamson <andros@xxxxxxxxxx> Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> --- fs/nfs/nfs4filelayout.c | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c index 474c630..33849d3 100644 --- a/fs/nfs/nfs4filelayout.c +++ b/fs/nfs/nfs4filelayout.c @@ -1106,6 +1106,7 @@ transfer_commit_list(struct list_head *src, struct list_head *dst, list_for_each_entry_safe(req, tmp, src, wb_list) { if (!nfs_lock_request(req)) continue; + kref_get(&req->wb_kref); if (cond_resched_lock(cinfo->lock)) list_safe_reset_next(req, tmp, wb_list); nfs_request_remove_commit_list(req, cinfo); -- 1.7.7.6 -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@xxxxxxxxxx www.netapp.com ��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥