On Tue, 2019-03-12 at 20:12 +0000, Trond Myklebust wrote:
> On Tue, 2019-03-12 at 20:04 +0000, Schumaker, Anna wrote:
> > Hi Trond,
> >
> > I'm seeing a hang when testing xfstests generic/013 on v4.1 with pNFS
> > after this patch:
> >
> > On Wed, 2018-09-05 at 14:07 -0400, Trond Myklebust wrote:
> > > If someone interrupts a wait on one or more outstanding layoutgets in
> > > pnfs_update_layout() then return the ERESTARTSYS/EINTR error.
> > >
> > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> > > ---
> > >  fs/nfs/pnfs.c | 26 ++++++++++++++++----------
> > >  1 file changed, 16 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> > > index e8f232de484f..7d9a51e6b847 100644
> > > --- a/fs/nfs/pnfs.c
> > > +++ b/fs/nfs/pnfs.c
> > > @@ -1740,16 +1740,16 @@ static bool pnfs_within_mdsthreshold(struct nfs_open_context *ctx,
> > >  	return ret;
> > >  }
> > >
> > > -static bool pnfs_prepare_to_retry_layoutget(struct pnfs_layout_hdr *lo)
> > > +static int pnfs_prepare_to_retry_layoutget(struct pnfs_layout_hdr *lo)
> > >  {
> > >  	/*
> > >  	 * send layoutcommit as it can hold up layoutreturn due to lseg
> > >  	 * reference
> > >  	 */
> > >  	pnfs_layoutcommit_inode(lo->plh_inode, false);
> > > -	return !wait_on_bit_action(&lo->plh_flags, NFS_LAYOUT_RETURN,
> > > +	return wait_on_bit_action(&lo->plh_flags, NFS_LAYOUT_RETURN,
> > >  				   nfs_wait_bit_killable,
> > > -				   TASK_UNINTERRUPTIBLE);
> > > +				   TASK_KILLABLE);
> > >  }
> > >
> > >  static void nfs_layoutget_begin(struct pnfs_layout_hdr *lo)
> > > @@ -1830,7 +1830,9 @@ pnfs_update_layout(struct inode *ino,
> > >  	}
> > >
> > >  lookup_again:
> > > -	nfs4_client_recover_expired_lease(clp);
> > > +	lseg = ERR_PTR(nfs4_client_recover_expired_lease(clp));
> > > +	if (IS_ERR(lseg))
> > > +		goto out;
> > >  	first = false;
> > >  	spin_lock(&ino->i_lock);
> > >  	lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);
> > > @@ -1863,9 +1865,9 @@ pnfs_update_layout(struct inode *ino,
> > >  	if (list_empty(&lo->plh_segs) &&
> > >  	    atomic_read(&lo->plh_outstanding) != 0) {
> > >  		spin_unlock(&ino->i_lock);
> > > -		if (wait_var_event_killable(&lo->plh_outstanding,
> > > -					atomic_read(&lo->plh_outstanding) == 0
> > > -					|| !list_empty(&lo->plh_segs)))
> > > +		lseg = ERR_PTR(wait_var_event_killable(&lo->plh_outstanding,
> > > +					atomic_read(&lo->plh_outstanding)));
> > > +		if (IS_ERR(lseg) || !list_empty(&lo->plh_segs))
> >
> > Was dropping the "== 0" condition attached to the atomic_read() here
> > a mistake?
> > I think what's happening is that my client is waiting for
> > plh_outstanding to be anything other than 0 when there isn't any work
> > left to do.
>
> Yes. That's a bug. How about the following patch?

This patch works for me, but for some reason doing "!atomic_read()" takes
8 minutes longer to complete compared to doing "atomic_read() == 0". I have
not run this multiple times to confirm that it's always the case.

Anna

>
> 8<---------------------------------------------------
> From 400417b05f3ec0531544ca5f94e64d838d8b8849 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> Date: Tue, 12 Mar 2019 16:04:51 -0400
> Subject: [PATCH] pNFS: Fix a typo in pnfs_update_layout
>
> We're supposed to wait for the outstanding layout count to go to zero,
> but that got lost somehow.
>
> Fixes: d03360aaf5cca ("pNFS: Ensure we return the error if someone...")
> Reported-by: Anna Schumaker <Anna.Schumaker@xxxxxxxxxx>
> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> ---
>  fs/nfs/pnfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
> index 8247bd1634cb..7066cd7c7aff 100644
> --- a/fs/nfs/pnfs.c
> +++ b/fs/nfs/pnfs.c
> @@ -1889,7 +1889,7 @@ pnfs_update_layout(struct inode *ino,
>  	    atomic_read(&lo->plh_outstanding) != 0) {
>  		spin_unlock(&ino->i_lock);
>  		lseg = ERR_PTR(wait_var_event_killable(&lo->plh_outstanding,
> -					atomic_read(&lo->plh_outstanding)));
> +					!atomic_read(&lo->plh_outstanding)));
>  		if (IS_ERR(lseg) || !list_empty(&lo->plh_segs))
>  			goto out_put_layout_hdr;
>  		pnfs_put_layout_hdr(lo);
> --
> 2.20.1
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx
>
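
For anyone reading the thread later, here is a rough userspace sketch of why
the dropped "== 0" matters. It is not kernel code: it only models the rule
that wait_var_event_killable() keeps sleeping until its condition evaluates
true, and replays the counter values a waiter might observe at successive
wake-ups. The names first_wakeup(), cond_buggy(), cond_fixed() and
check_model() are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a wait condition evaluated against the
 * current value of the outstanding-layoutget counter. */
typedef bool (*wait_cond_fn)(int outstanding);

/* Condition as in the buggy hunk: true while the count is non-zero. */
static bool cond_buggy(int outstanding) { return outstanding; }

/* Condition after the fix: true once the count has reached zero. */
static bool cond_fixed(int outstanding) { return !outstanding; }

/*
 * Replay the counter values seen at successive condition checks.
 * Returns the index at which the waiter would stop waiting, or -1 if
 * the condition never becomes true (in this model, a hang).
 */
static int first_wakeup(const int *values, size_t n, wait_cond_fn cond)
{
	for (size_t i = 0; i < n; i++)
		if (cond(values[i]))
			return (int)i;
	return -1;
}

/* Two assumed scenarios: layoutgets draining, and nothing left to do. */
static void check_model(void)
{
	const int draining[] = { 2, 1, 0 };
	const int idle[] = { 0, 0, 0 };

	/* Fixed condition waits until the count drains to zero. */
	assert(first_wakeup(draining, 3, cond_fixed) == 2);
	/* Buggy condition stops waiting while work is still outstanding. */
	assert(first_wakeup(draining, 3, cond_buggy) == 0);
	/* With no work left, the fixed condition proceeds at once... */
	assert(first_wakeup(idle, 3, cond_fixed) == 0);
	/* ...while the buggy one never sees a non-zero count. */
	assert(first_wakeup(idle, 3, cond_buggy) == -1);
}
```

The last case corresponds to the hang reported above: the count is already
zero, so a waiter that wants it to be non-zero has nothing to wake it up.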