On Fri, 2023-01-06 at 09:56 +1100, NeilBrown wrote: > On Wed, 04 Jan 2023, NeilBrown wrote: > > On Wed, 04 Jan 2023, Trond Myklebust wrote: > > > On Wed, 2023-01-04 at 12:01 +1100, NeilBrown wrote: > > > > On Wed, 04 Jan 2023, Trond Myklebust wrote: > > > > > > > > > > > > > > > If the server starts to reply NFS4ERR_STALE to GETATTR > > > > > requests, > > > > > why do > > > > > we care about stateid values? Just mark the inode as stale > > > > > and drop > > > > > it > > > > > on the floor. > > > > > > > > We have a valid state from the server - we really shouldn't > > > > just > > > > ignore > > > > it. > > > > > > > > Maybe it would be OK to mark the inode stale. I don't know if > > > > various > > > > retry loops abort properly when the inode is stale. > > > > > > Yes, they are all supposed to do that. Otherwise we would end up > > > looping forever in close(), for instance, since the PUTFH, > > > GETATTR and > > > CLOSE can all return NFS4ERR_STALE as well. > > > > To mark the inode as STALE we still need to find the inode, and > > that is > > the key bit missing in the current code. Once we find the inode, > > we > > could mark it stale, but maybe some other error resulted in the > > missing > > GETATTR... > > > > It might make sense to put the new code in _nfs4_proc_open() after > > the > > explicit nfs4_proc_getattr() fails. We would need to find the > > inode > > given only the filehandle. Is there any easy way to do that? > > > > Thanks, > > NeilBrown > > > > I couldn't see a consistent pattern to follow for when to mark an > inode > as stale. Do this, on top of the previous patch, seem reasonable? > > Thanks, > NeilBrown > > > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c > index b441b1d14a50..04497cb42154 100644 > --- a/fs/nfs/nfs4proc.c > +++ b/fs/nfs/nfs4proc.c > @@ -489,6 +489,8 @@ static int nfs4_do_handle_exception(struct > nfs_server *server, > case -ESTALE: > if (inode != NULL && S_ISREG(inode->i_mode)) > pnfs_destroy_layout(NFS_I(inode)); > + if (inode) > + nfs_set_inode_stale(inode); This is normally dealt with in the generic code inside nfs_revalidate_inode(). There should be no need to duplicate it here. > break; > case -NFS4ERR_DELEG_REVOKED: > case -NFS4ERR_ADMIN_REVOKED: > @@ -2713,8 +2715,12 @@ static int _nfs4_proc_open(struct > nfs4_opendata *data, > return status; > } > if (!(o_res->f_attr->valid & NFS_ATTR_FATTR)) { > + struct inode *inode = nfs4_get_inode_by_stateid( > + &data->o_res.stateid, > + data->owner); There shouldn't be a need to go looking for open descriptors here, because they will hit ESTALE at some point later anyway. If we're going to change anything, I'd rather see us return -EACCES and -ESTALE from the decode_access() and decode_getfattr() calls in nfs4_xdr_dec_open() (and only those errors from those two calls!) so that we can skip the unnecessary getattr call here. In fact, the only case that this extra getattr should be trying to address is the one where the server returns NFS4ERR_DELAY to either the decode_access() or the decode_getfattr() calls specifically, and where we therefore don't want to replay the entire open call. > nfs4_sequence_free_slot(&o_res->seq_res); > - nfs4_proc_getattr(server, &o_res->fh, o_res->f_attr, > NULL); > + nfs4_proc_getattr(server, &o_res->fh, o_res->f_attr, > inode); > + iput(inode); > } > return 0; > } > @@ -4335,6 +4341,7 @@ int nfs4_proc_getattr(struct nfs_server > *server, struct nfs_fh *fhandle, > { > struct nfs4_exception exception = { > .interruptible = true, > + .inode = inode, > }; > int err; > do { -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx