On Sat, 18 Nov 2023, Chuck Lever wrote: > On Fri, Nov 17, 2023 at 06:41:37AM -0500, Jeff Layton wrote: > > On Fri, 2023-11-17 at 13:18 +1100, NeilBrown wrote: > > > NFS4_CLOSED_DELEG_STID and NFS4_REVOKED_DELEG_STID are similar in > > > purpose. > > > REVOKED is used for NFSv4.1 states which have been revoked because the > > > lease has expired. CLOSED is used in other cases. > > > The difference has two practical effects. > > > 1/ REVOKED states are on the ->cl_revoked list > > > 2/ REVOKED states result in nfserr_deleg_revoked from > > > nfsd4_verify_open_stid() asnd nfsd4_validate_stateid while > > > CLOSED states result in nfserr_bad_stid. > > > > > > Currently a state that is being revoked is first set to "CLOSED" in > > > unhash_delegation_locked(), then possibly to "REVOKED" in > > > revoke_delegation(), at which point it is added to the cl_revoked list. > > > > > > It is possible that a stateid test could see the CLOSED state > > > which really should be REVOKED, and so return the wrong error code. So > > > it is safest to remove this window of inconsistency. > > > > > > With this patch, unhash_delegation_locked() always set the state > > > correctly, and revoke_delegation() no longer changes the state. > > > > > > Also remove a redundant test on minorversion when > > > NFS4_REVOKED_DELEG_STID is seen - it can only be seen when minorversion > > > is non-zero. > > > > > > Signed-off-by: NeilBrown <neilb@xxxxxxx> > > > --- > > > fs/nfsd/nfs4state.c | 20 ++++++++++---------- > > > 1 file changed, 10 insertions(+), 10 deletions(-) > > > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > > index 6368788a7d4e..7469583382fb 100644 > > > --- a/fs/nfsd/nfs4state.c > > > +++ b/fs/nfsd/nfs4state.c > > > @@ -1334,7 +1334,7 @@ static bool delegation_hashed(struct nfs4_delegation *dp) > > > } > > > > > > static bool > > > -unhash_delegation_locked(struct nfs4_delegation *dp) > > > +unhash_delegation_locked(struct nfs4_delegation *dp, unsigned char type) > > unsigned short type ? At this stage in the series 'type' is still an unsigned char. I don't want to get ahead of myself. > > > > > { > > > struct nfs4_file *fp = dp->dl_stid.sc_file; > > > > > > @@ -1343,7 +1343,9 @@ unhash_delegation_locked(struct nfs4_delegation *dp) > > > if (!delegation_hashed(dp)) > > > return false; > > > > > > - dp->dl_stid.sc_type = NFS4_CLOSED_DELEG_STID; > > > + if (dp->dl_stid.sc_client->cl_minorversion == 0) > > > + type = NFS4_CLOSED_DELEG_STID; > > > + dp->dl_stid.sc_type = type; > > > /* Ensure that deleg break won't try to requeue it */ > > > ++dp->dl_time; > > > spin_lock(&fp->fi_lock); > > > @@ -1359,7 +1361,7 @@ static void destroy_delegation(struct nfs4_delegation *dp) > > > bool unhashed; > > > > > > spin_lock(&state_lock); > > > - unhashed = unhash_delegation_locked(dp); > > > + unhashed = unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID); > > > spin_unlock(&state_lock); > > > if (unhashed) > > > destroy_unhashed_deleg(dp); > > > @@ -1373,9 +1375,8 @@ static void revoke_delegation(struct nfs4_delegation *dp) > > > > > > trace_nfsd_stid_revoke(&dp->dl_stid); > > > > > > - if (clp->cl_minorversion) { > > > + if (dp->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) { > > > spin_lock(&clp->cl_lock); > > > - dp->dl_stid.sc_type = NFS4_REVOKED_DELEG_STID; > > > refcount_inc(&dp->dl_stid.sc_count); > > > list_add(&dp->dl_recall_lru, &clp->cl_revoked); > > > spin_unlock(&clp->cl_lock); > > > @@ -2234,7 +2235,7 @@ __destroy_client(struct nfs4_client *clp) > > > spin_lock(&state_lock); > > > while (!list_empty(&clp->cl_delegations)) { > > > dp = list_entry(clp->cl_delegations.next, struct nfs4_delegation, dl_perclnt); > > > - WARN_ON(!unhash_delegation_locked(dp)); > > > + WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID)); > > > list_add(&dp->dl_recall_lru, &reaplist); > > > } > > > spin_unlock(&state_lock); > > > @@ -5197,8 +5198,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open, > > > goto out; > > > if (deleg->dl_stid.sc_type == NFS4_REVOKED_DELEG_STID) { > > > nfs4_put_stid(&deleg->dl_stid); > > > - if (cl->cl_minorversion) > > > - status = nfserr_deleg_revoked; > > > + status = nfserr_deleg_revoked; > > > goto out; > > > } > > > flags = share_access_to_flags(open->op_share_access); > > > @@ -6235,7 +6235,7 @@ nfs4_laundromat(struct nfsd_net *nn) > > > dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru); > > > if (!state_expired(<, dp->dl_time)) > > > break; > > > - WARN_ON(!unhash_delegation_locked(dp)); > > > + WARN_ON(!unhash_delegation_locked(dp, NFS4_REVOKED_DELEG_STID)); > > > list_add(&dp->dl_recall_lru, &reaplist); > > > } > > > spin_unlock(&state_lock); > > > @@ -8350,7 +8350,7 @@ nfs4_state_shutdown_net(struct net *net) > > > spin_lock(&state_lock); > > > list_for_each_safe(pos, next, &nn->del_recall_lru) { > > > dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru); > > > - WARN_ON(!unhash_delegation_locked(dp)); > > > + WARN_ON(!unhash_delegation_locked(dp, NFS4_CLOSED_DELEG_STID)); > > Neil, what say we get rid of these WARN_ONs? > I've added a patch with this intro: Author: NeilBrown <neilb@xxxxxxx> Date: Mon Nov 20 09:15:46 2023 +1100 nfsd: don't call functions with side-effecting inside WARN_ON() Code like: WARN_ON(foo()) looks like an assertion and might not be expected to have any side effects. When testing if a function with side-effects fails a construct like if (foo()) WARN_ON(1); makes the intent more obvious. nfsd has several WARN_ON calls where the test has side effects, so it would be good to change them. These cases don't really need the WARN_ON. They have never failed in 8 years of usage so let's just remove the WARN_ON wrapper. Suggested-by: Chuck Lever <chuck.lever@xxxxxxxxxx> Signed-off-by: NeilBrown <neilb@xxxxxxx> it removes 5 WARN_ONs from unhash_delegation_locked() calls. They were added by Commit 3fcbbd244ed1 ("nfsd: ensure that delegation stateid hash references are only put once") in 4.3 Thanks, NeilBrown > > > > list_add(&dp->dl_recall_lru, &reaplist); > > > } > > > spin_unlock(&state_lock); > > > > Same question here. Should this go to stable? I guess the race is not > > generally fatal... > > Again, there's no existing bug report, so no urgency to get this > backported. I would see these changes soak in upstream rather than > pull them into stable quickly only to discover another bug has been > introduced. > > We don't have a failing test or a data corruption risk, as far as > I can tell. > > > > Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx> > > -- > Chuck Lever >