-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Trond Myklebust wrote: > On Sat, 2010-05-22 at 09:18 -0700, Stuart Sheldon wrote: >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA256 >> >> On 05/22/2010 09:09 AM, Trond Myklebust wrote: >>> Do you see any more NFS traffic to the server when the above hang >>> occurs? I'm wondering if we don't need something like the following >>> patch. >>> >>> Cheers >>> Trond >>> -------------------------------------------------------------------------------- >>> From 0b574497e05f62fd49cfe26f1b97e3669525446c Mon Sep 17 00:00:00 2001 >>> From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> >>> Date: Sat, 22 May 2010 11:49:19 -0400 >>> Subject: [PATCH] NFS: Ensure that we mark the inode as dirty if we exit early from commit >>> >>> If we exit from nfs_commit_inode() without ensuring that the COMMIT rpc >>> call has been completed, we must re-mark the inode as dirty. Otherwise, >>> future calls to sync_inode() with the WB_SYNC_ALL flag set will fail to >>> ensure that the data is on the disk. >>> >>> Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> >>> Cc: stable@xxxxxxxxxx >>> --- >>> fs/nfs/write.c | 13 +++++++++++-- >>> 1 files changed, 11 insertions(+), 2 deletions(-) >>> >>> diff --git a/fs/nfs/write.c b/fs/nfs/write.c >>> index 3aea3ca..b8a6d7a 100644 >>> --- a/fs/nfs/write.c >>> +++ b/fs/nfs/write.c >>> @@ -1386,7 +1386,7 @@ static int nfs_commit_inode(struct inode *inode, int how) >>> int res = 0; >>> >>> if (!nfs_commit_set_lock(NFS_I(inode), may_wait)) >>> - goto out; >>> + goto out_mark_dirty; >>> spin_lock(&inode->i_lock); >>> res = nfs_scan_commit(inode, &head, 0, 0); >>> spin_unlock(&inode->i_lock); >>> @@ -1398,9 +1398,18 @@ static int nfs_commit_inode(struct inode *inode, int how) >>> wait_on_bit(&NFS_I(inode)->flags, NFS_INO_COMMIT, >>> nfs_wait_bit_killable, >>> TASK_KILLABLE); >>> + else >>> + goto out_mark_dirty; >>> } else >>> nfs_commit_clear_lock(NFS_I(inode)); >>> -out: >>> + return res; >>> + /* Note: If we exit without ensuring that the commit is complete, >>> + * we must mark the inode as dirty. Otherwise, future calls to >>> + * sync_inode() with the WB_SYNC_ALL flag set will fail to ensure >>> + * that the data is on the disk. >>> + */ >>> +out_mark_dirty: >>> + __mark_inode_dirty(inode, I_DIRTY_DATASYNC); >>> return res; >>> } >>> >> Trond, >> >> When it occurred, it continues to throw those errors in the log, and all >> access to the NFS mount stalled until I hard reset the client system. >> >> Do you want me to apply the patch and see if I can recreate the condition? > > Yes, please do. Could you also apply the following debugging patch on > top of the above one, and see if the WARN_ON() triggers when both > patches are applied? > > Cheers > Trond > ------------------------------------------------------------------------------------------------ > From 9883e35957468987f4338525c1d800d637bc05b7 Mon Sep 17 00:00:00 2001 > From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> > Date: Sat, 22 May 2010 10:46:41 -0400 > Subject: [PATCH 2/2] NFS: debugging code for nfs_wb_page() > > Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx> > --- > fs/nfs/write.c | 9 +++++++++ > 1 files changed, 9 insertions(+), 0 deletions(-) > > diff --git a/fs/nfs/write.c b/fs/nfs/write.c > index b8a6d7a..0558fab 100644 > --- a/fs/nfs/write.c > +++ b/fs/nfs/write.c > @@ -1519,12 +1519,21 @@ int nfs_wb_page(struct inode *inode, struct page *page) > int ret; > > while(PagePrivate(page)) { > + unsigned dirty; > + int syncing; > + > wait_on_page_writeback(page); > if (clear_page_dirty_for_io(page)) { > ret = nfs_writepage_locked(page, &wbc); > if (ret < 0) > goto out_error; > + continue; > } > + > + dirty = inode->i_state & I_DIRTY; > + syncing = inode->i_state & I_SYNC; > + WARN_ON(!syncing && !dirty && PagePrivate(page)); > + > ret = sync_inode(inode, &wbc); > if (ret < 0) > goto out_error; The problem seems to be fixed with this, but I'm not seeing / don't know where to find the 'WARN_ON' messages. If they are suppose to be in the syslog, then there weren't any. I'm rolling back to the unpatched kernel to verify that I can still reproduce the problem natively. Will follow up on Monday. Stu - -- If you took all the girls I knew When I was single And brought them all together for one night I know theyd never match My sweet imagination And everything looks worse in black and white -- Paul Simon - "Kodachrome Lyrics" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iQIcBAEBCAAGBQJL+WPYAAoJEFKVLITDJSGSU0sP/jdJt9NiliGFJ0IB3I4Pt6o/ aHI5wzWWG8Uxzn5UXBumEp79hWXZ8D79kLZ4L8zh9/9hpi8rbExz1ci9IJmjU1LW qWQVDqmv36uKX9YUzmj8d4505G0Czf9BkU6vsOy4elZ4pAy/Q9EXKFVS5mtirO7u 9FeYJebvbhvdICJTaLDbryugpxWYV6P6bGVglowdbqVWBnKo5QXevWnm6s3Lc1Jd girpqkQ2f4NddfeW1TbITtBr0bEPYuhK4s4XMdWiYIHNIaRSBJDF5Hlues8LWxu2 ++4xz1G7n/K59hRBRX+giBGaSXXl/GSGib87RfwSCrg5qEytNbSRKQX0WuFFxARS tTbU+zwDpUF7SSvYJZGDh2EEPr2QNfOVCCxVmf1Oe4JAs0OJku1z1ReKpr+CoZg2 lgIFl59bPBNjMcx8GNynnJgTW1IMXWsM8UpTpiAfwTpffXaW3YEH2V855Px9Mkqt ONuvCll3CWEbwdmisWqvqRAix7oHNh4VMnDOTfb1eYf5ytw5mLfxZaznXhwL8FjE pUvHZG4TRMUENjucs38dvwx4Vx63DEdxMSK5C0GpdsI16xh0hMKa3ohWaVtSgwIE Emf1HU2G0vdxl+zMI1IyerNp1T+oxu1rr7eOYWzl3HO8bQc+9ua7yCntpni/c7Dz Ge8hUPZTZAVmFRRy9zBO =2+R2 -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html