On May 22, 2012, at 4:40 PM, Myklebust, Trond wrote: > On Tue, 2012-05-22 at 14:07 -0400, Andy Adamson wrote: >> On Tue, May 22, 2012 at 1:46 PM, Myklebust, Trond >> <Trond.Myklebust@xxxxxxxxxx> wrote: >>> On Tue, 2012-05-22 at 16:30 +0000, Adamson, Andy wrote: >>>> On May 22, 2012, at 12:24 PM, Myklebust, Trond wrote: >>>> >>>>> On Tue, 2012-05-22 at 08:09 -0400, andros@xxxxxxxxxx wrote: >>>>>> From: Andy Adamson <andros@xxxxxxxxxx> >>>>>> >>>>>> Signed-off-by: Andy Adamson <andros@xxxxxxxxxx> >>>>>> --- >>>>>> fs/nfs/write.c | 2 ++ >>>>>> 1 files changed, 2 insertions(+), 0 deletions(-) >>>>>> >>>>>> diff --git a/fs/nfs/write.c b/fs/nfs/write.c >>>>>> index e6fe3d6..c7295de 100644 >>>>>> --- a/fs/nfs/write.c >>>>>> +++ b/fs/nfs/write.c >>>>>> @@ -1555,6 +1555,8 @@ static void nfs_commit_release_pages(struct nfs_commit_data *data) >>>>>> /* We have a mismatch. Write the page again */ >>>>>> dprintk(" mismatch\n"); >>>>>> nfs_mark_request_dirty(req); >>>>>> + nfs_unlock_request(req); >>>>>> + continue; >>>>>> next: >>>>>> nfs_unlock_and_release_request(req); >>>>>> } >>>>> >>>>> What is this patch trying to fix? As far as I can see it will lead to a >>>>> reference leak. >>>> >>>> The release of the page drops the refcount to zero. Since it is a mismatch, we want to continue to use the page, so nfs_page_find_request_locked is called we get the WARNING in kref_get, and an Oops in various places in pnfs_update_layout. >>>> >>>> Here is the output of added printk's. >>>> >>>> -->Andy >>>> >>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988921] NFS: commit (0:35/2 4096@184320) >>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988923] mismatch req ffff880042dbb800 >>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988927] nfs_release_request WB req ffff880042dbb800 wb_kref 1 >>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988932] put_lseg: lseg ffff880042d95380 ref 4 valid 0 >>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988945] nfs_page_find_request_locked req ffff88003e033800 >>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988949] nfs_page_find_request_locked WB req ffff88003e033800 wb_kref 0 >>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.988953] ------------[ cut here ]------------ >>>> May 21 18:25:33 fedora-64-2 kernel: [ 152.989082] WARNING: at include/linux/kref.h:41 kref_get+0x20/0x2c [nfs]() >>> >>> Are these commit-to-ds requests? >> >> Yes > > OK, so how about the following fix: Yep - works just fine. Thanks! -->Andy > > 8<--------------------------------------------------------------- > From 53b8ee346463946f88b3e1639d688c384df1166c Mon Sep 17 00:00:00 2001 > From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> > Date: Tue, 22 May 2012 16:36:27 -0400 > Subject: [PATCH] NFSv4.1: Fix a bad reference count issue in the pNFS commit > code > > filelayout_scan_commit_lists needs to bump the reference count on > the struct nfs_page just like nfs_scan_commit_list(). > > Reported-by: Andy Adamson <andros@xxxxxxxxxx> > Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> > --- > fs/nfs/nfs4filelayout.c | 1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/fs/nfs/nfs4filelayout.c b/fs/nfs/nfs4filelayout.c > index 474c630..33849d3 100644 > --- a/fs/nfs/nfs4filelayout.c > +++ b/fs/nfs/nfs4filelayout.c > @@ -1106,6 +1106,7 @@ transfer_commit_list(struct list_head *src, struct list_head *dst, > list_for_each_entry_safe(req, tmp, src, wb_list) { > if (!nfs_lock_request(req)) > continue; > + kref_get(&req->wb_kref); > if (cond_resched_lock(cinfo->lock)) > list_safe_reset_next(req, tmp, wb_list); > nfs_request_remove_commit_list(req, cinfo); > -- > 1.7.7.6 > > > -- > Trond Myklebust > Linux NFS client maintainer > > NetApp > Trond.Myklebust@xxxxxxxxxx > www.netapp.com > -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html