Re: [PATCH] NFS: Handle missing attributes in OPEN reply

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 3, 2023 at 9:17 PM Trond Myklebust <trondmy@xxxxxxxxxx> wrote:
>
> On Tue, 2023-01-03 at 21:02 -0500, Olga Kornievskaia wrote:
> > On Tue, Jan 3, 2023 at 7:46 PM Trond Myklebust <trondmy@xxxxxxxxxx>
> > wrote:
> > >
> > > On Wed, 2023-01-04 at 11:27 +1100, NeilBrown wrote:
> > > >
> > > > If a NFSv4 OPEN reply reports that the file was successfully
> > > > opened
> > > > but
> > > > the subsequent GETATTR fails, Linux-NFS will attempt a stand-
> > > > alone
> > > > GETATTR request.  If that also fails, handling of the reply is
> > > > aborted
> > > > with error -EAGAIN and the open is attempted again from the
> > > > start.
> > > >
> > > > This leaves the server with an active state (because the OPEN
> > > > operation
> > > > succeeded) which the client doesn't know about.  If the open-
> > > > owner
> > > > (local user) did not have the file already open, this has minimal
> > > > consequences for the client and only causes the server to spend
> > > > resources on an open state that will never be used or explicitly
> > > > closed.
> > > >
> > > > If the open-owner DID already have the file open, then it will
> > > > hold a
> > > > reference to the open-state for that file, but the seq-id in the
> > > > state-id will now be out-of-sync with the server.  The server
> > > > will
> > > > have
> > > > incremented the seq-id, but the client will not have noticed.  So
> > > > when
> > > > the client next attempts to access the file using that state
> > > > (READ,
> > > > WRITE, SETATTR), the attempt will fail with NFS4ERR_OLD_STATEID.
> > > >
> > > > The Linux-client assumes this error is due to a race and simply
> > > > retries
> > > > on the basis that the local state-id information should have been
> > > > updated by another thread.  This basis is invalid in this case
> > > > and
> > > > the
> > > > result is an infinite loop attempting IO and getting OLD_STATEID.
> > > >
> > > > This has been observed with a NetApp Filer as the server (ONTAP
> > > > 9.8
> > > > p5,
> > > > using NFSv4.0).  The client is creating, writing, and unlinking a
> > > > particular file from multiple clients (.bash_history).  If a new
> > > > OPEN
> > > > from one client races with a REMOVE from another client while the
> > > > first
> > > > client already has the file open, the Filer can report success
> > > > for
> > > > the
> > > > OPEN op, but NFS4ERR_STALE for the ACCESS or GETATTR ops in the
> > > > OPEN
> > > > request.  This gets the seq-id out-of-sync and a subsequent write
> > > > to
> > > > the
> > > > other open on the first client causes the infinite loop to occur.
> > > >
> > > > The reason that the client returns -EAGAIN is that it needs to
> > > > find
> > > > the
> > > > inode so it can find the associated state to update the seq-id,
> > > > but
> > > > the
> > > > inode lookup requires the file-id which is provided in the
> > > > GETATTR
> > > > reply.  Without the file-id normal inode lookup cannot be used.
> > > >
> > > > This patch changes the lookup so that when the file-id is not
> > > > available
> > > > the list of states owned by the open-owner is examined to find
> > > > the
> > > > state
> > > > with the correct state-id (ignoring the seq-id part of the state-
> > > > id).
> > > > If this is found it is used just as when a normal inode lookup
> > > > finds
> > > > an
> > > > inode.  If it isn't found, -EAGAIN is returned as before.
> > > >
> > > > This bug can be demonstrated by modifying the Linux NFS server as
> > > > follows:
> > > >
> > > > 1/ The second time a file is opened, unlink it.  This simulates
> > > >    a race with another client, without needing to have a race:
> > > >
> > > >     --- a/fs/nfsd/nfs4proc.c
> > > >     +++ b/fs/nfsd/nfs4proc.c
> > > >     @@ -594,6 +594,12 @@ nfsd4_open(struct svc_rqst *rqstp,
> > > > struct
> > > > nfsd4_compound_state *cstate,
> > > >         if (reclaim && !status)
> > > >                 nn->somebody_reclaimed = true;
> > > >      out:
> > > >     +   if (!status && open->op_stateid.si_generation > 1) {
> > > >     +           printk("Opening gen %d\n", (int)open-
> > > > > op_stateid.si_generation);
> > > >     +           vfs_unlink(mnt_user_ns(resfh->fh_export-
> > > > > ex_path.mnt),
> > > >     +                      resfh->fh_dentry->d_parent->d_inode,
> > > >     +                      resfh->fh_dentry, NULL);
> > > >     +   }
> > > >         if (open->op_filp) {
> > > >                 fput(open->op_filp);
> > > >                 open->op_filp = NULL;
> > > >
> > > > 2/ When a GETATTR op is attempted on an unlinked file, return
> > > > ESTALE
> > > >
> > > >     @@ -852,6 +858,11 @@ nfsd4_getattr(struct svc_rqst *rqstp,
> > > > struct
> > > > nfsd4_compound_state *cstate,
> > > >         if (status)
> > > >                 return status;
> > > >
> > > >     +   if (cstate->current_fh.fh_dentry->d_inode->i_nlink == 0)
> > > > {
> > > >     +           printk("Return Estale for unlinked file\n");
> > > >     +           return nfserr_stale;
> > > >     +   }
> > > >     +
> > > >         if (getattr->ga_bmval[1] & NFSD_WRITEONLY_ATTRS_WORD1)
> > > >                 return nfserr_inval;
> > > >
> > > > Then mount the filesystem and
> > > >
> > > >   Thread 1            Thread 2
> > > >   open a file
> > > >                       open the same file (will fail)
> > > >   write to that file
> > > >
> > > > I use this shell fragment, using 'sleep' for synchronisation.
> > > > The use of /bin/echo ensures the write is flushed when /bin/echo
> > > > closes
> > > > the fd on exit.
> > > >
> > > >     (
> > > >         /bin/echo hello
> > > >         sleep 3
> > > >         /bin/echo there
> > > >     ) > /import/A/afile &
> > > >     sleep 3
> > > >     cat /import/A/afile
> > > >
> > > > Probably when the OPEN succeeds, the GETATTR fails, and we don't
> > > > already
> > > > have the state open, we should explicitly close the state.
> > > > Leaving
> > > > it
> > > > open could cause problems if, for example, the server revoked it
> > > > and
> > > > signalled the client that there was a revoked state.  The client
> > > > would
> > > > not be able to find the state that needed to be relinquished.  I
> > > > haven't
> > > > attempted to implement this.
> > >
> > >
> > > If the server starts to reply NFS4ERR_STALE to GETATTR requests,
> > > why do
> > > we care about stateid values?
> >
> > It is acceptable for the server to return ESTALE to the GETATTR after
> > the processing the open (due to a REMOVE that comes in) and that open
> > generating a valid stateid which client should care about when there
> > are pre-existing opens. The server will keep the state of an existing
> > opens valid even if the file is removed. Which is what's happening,
> > the previous open is being used for IO but the stateid is updated on
> > the server but not on the client.
> >
> > > Just mark the inode as stale and drop it
> > > on the floor.
> >
> > Why would that be correct? Any pre-existing opens should continue
> > operating, thus the inode can't be marked stale. We don't do it now
> > (silly rename allows preexisting IO to continue).
> >
> > > If the server tries to declare the state as revoked, then it is
> > > clearly
> > > borken, so let NetApp fix their own bug.
> >
> > The server does not declare the state revoked. The open succeeded and
> > its state's seqid was updated but the client is using an old stateid.
> > Server isn't at fault here.
>
> If the client can't send a GETATTR, then it can't revalidate
> attributes, do I/O, and it can't even close the file. So we're not
> going to waste time and effort trying to support this, whether or not
> NetApp thinks it is a good idea.

That makes sense but I think the server should fail PUTFHs in the
compounds after the REMOVE was processed, not the other (GETATTR,
WRITE) operations, right?

>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx
>
>



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux