On Tue, 2020-09-22 at 15:15 -0400, Benjamin Coddington wrote: > Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in > CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a CLOSE > races > with the update of the nfs_state: > > Process 1 Process 2 Server > ========= ========= ======== > OPEN file > OPEN file > Reply OPEN (1) > Reply OPEN (2) > Update state (1) > CLOSE file (1) > Reply OLD_STATEID (1) > CLOSE file (2) > Reply CLOSE (-1) > Update state (2) > wait for state change > OPEN file > wake > CLOSE file > OPEN file > wake > CLOSE file > ... > ... > > As long as the first process continues updating state, the second > process > will fail to exit the loop in nfs_set_open_stateid_locked(). This > livelock > has been observed in generic/168. > > Fix this by detecting the case in nfs_need_update_open_stateid() and > then exit the loop if: > - the state is NFS_OPEN_STATE, and > - the stateid sequence is > 1, and > - the stateid doesn't match the current open stateid > > Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in > CLOSE/OPEN_DOWNGRADE") > Cc: stable@xxxxxxxxxxxxxxx # v5.4+ > Signed-off-by: Benjamin Coddington <bcodding@xxxxxxxxxx> > --- > fs/nfs/nfs4proc.c | 27 +++++++++++++++++---------- > 1 file changed, 17 insertions(+), 10 deletions(-) > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c > index 6e95c85fe395..9db29a438540 100644 > --- a/fs/nfs/nfs4proc.c > +++ b/fs/nfs/nfs4proc.c > @@ -1588,18 +1588,25 @@ static void > nfs_test_and_clear_all_open_stateid(struct nfs4_state *state) > static bool nfs_need_update_open_stateid(struct nfs4_state *state, > const nfs4_stateid *stateid) > { > - if (test_bit(NFS_OPEN_STATE, &state->flags) == 0 || > - !nfs4_stateid_match_other(stateid, &state->open_stateid)) { > - if (stateid->seqid == cpu_to_be32(1)) > + bool state_matches_other = nfs4_stateid_match_other(stateid, > &state->open_stateid); > + bool seqid_one = stateid->seqid == cpu_to_be32(1); > + > + if (test_bit(NFS_OPEN_STATE, &state->flags)) { > + /* The common case - we're updating to a new sequence > number */ > + if (state_matches_other && > nfs4_stateid_is_newer(stateid, &state->open_stateid)) { > + nfs_state_log_out_of_order_open_stateid(state, > stateid); > + return true; > + } > + } else { > + /* This is the first OPEN */ > + if (!state_matches_other && seqid_one) { Why are we looking at the value of state_matches_other here? If the NFS_OPEN_STATE flags is not set, then the state->open_stateid is not initialised, so how does it make sense to look at a comparison with the incoming stateid? > nfs_state_log_update_open_stateid(state); > - else > + return true; > + } > + if (!state_matches_other && !seqid_one) { Ditto. > set_bit(NFS_STATE_CHANGE_WAIT, &state->flags); > - return true; > - } > - > - if (nfs4_stateid_is_newer(stateid, &state->open_stateid)) { > - nfs_state_log_out_of_order_open_stateid(state, > stateid); > - return true; > + return true; > + } > } > return false; > } -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx