On Tue, 2020-09-22 at 17:08 -0400, Benjamin Coddington wrote:
> On 22 Sep 2020, at 15:48, Trond Myklebust wrote:
> 
> > On Tue, 2020-09-22 at 15:15 -0400, Benjamin Coddington wrote:
> > > Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
> > > CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a CLOSE
> > > races with the update of the nfs_state:
> > > 
> > > Process 1             Process 2             Server
> > > =========             =========             ========
> > > OPEN file
> > >                       OPEN file
> > >                                             Reply OPEN (1)
> > >                                             Reply OPEN (2)
> > > Update state (1)
> > > CLOSE file (1)
> > >                                             Reply OLD_STATEID (1)
> > > CLOSE file (2)
> > >                                             Reply CLOSE (-1)
> > >                       Update state (2)
> > >                       wait for state change
> > > OPEN file
> > >                       wake
> > > CLOSE file
> > > OPEN file
> > >                       wake
> > > CLOSE file
> > > ...
> > >                       ...
> > > 
> > > As long as the first process continues updating state, the second
> > > process will fail to exit the loop in nfs_set_open_stateid_locked().
> > > This livelock has been observed in generic/168.
> > > 
> > > Fix this by detecting the case in nfs_need_update_open_stateid() and
> > > exiting the loop if:
> > >  - the state is NFS_OPEN_STATE, and
> > >  - the stateid sequence is > 1, and
> > >  - the stateid doesn't match the current open stateid
> > > 
> > > Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in CLOSE/OPEN_DOWNGRADE")
> > > Cc: stable@xxxxxxxxxxxxxxx # v5.4+
> > > Signed-off-by: Benjamin Coddington <bcodding@xxxxxxxxxx>
> > > ---
> > >  fs/nfs/nfs4proc.c | 27 +++++++++++++++++----------
> > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > > index 6e95c85fe395..9db29a438540 100644
> > > --- a/fs/nfs/nfs4proc.c
> > > +++ b/fs/nfs/nfs4proc.c
> > > @@ -1588,18 +1588,25 @@ static void nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
> > >  static bool nfs_need_update_open_stateid(struct nfs4_state *state,
> > >  		const nfs4_stateid *stateid)
> > >  {
> > > -	if (test_bit(NFS_OPEN_STATE, &state->flags) == 0 ||
> > > -	    !nfs4_stateid_match_other(stateid, &state->open_stateid)) {
> > > -		if (stateid->seqid == cpu_to_be32(1))
> > > +	bool state_matches_other = nfs4_stateid_match_other(stateid, &state->open_stateid);
> > > +	bool seqid_one = stateid->seqid == cpu_to_be32(1);
> > > +
> > > +	if (test_bit(NFS_OPEN_STATE, &state->flags)) {
> > > +		/* The common case - we're updating to a new sequence number */
> > > +		if (state_matches_other && nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
> > > +			nfs_state_log_out_of_order_open_stateid(state, stateid);
> > > +			return true;
> > > +		}
> > > +	} else {
> > > +		/* This is the first OPEN */
> > > +		if (!state_matches_other && seqid_one) {
> > 
> > Why are we looking at the value of state_matches_other here? If the
> > NFS_OPEN_STATE flag is not set, then the state->open_stateid is not
> > initialised, so how does it make sense to look at a comparison with
> > the incoming stateid?
> 
> You're right - it doesn't make sense. I failed to clean it up out of my
> decision matrix. I'll send a v3 after it gets tested overnight.
> 
> Thanks for the look, and if you have another moment - what do you think
> about not bumping the seqid in the CLOSE/OPEN_DOWNGRADE path on
> OLD_STATEID if the state's refcount > 1?
> This would optimize away a lot of recovery for races like this, but I
> wonder if it would be possible to end up with two dueling
> OPEN_DOWNGRADEs that would never recover.

It would lead to a return of the infinite loop in cases where the
client misses a reply to an OPEN or CLOSE due to a soft timeout (which
is an important use case for us).

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
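
[Editor's note] For readers following the thread, below is a small standalone
userspace model of the branch structure under discussion - it is not the
fs/nfs/nfs4proc.c code and not Benjamin's v3, just a sketch that applies
Trond's point by dropping the state_matches_other check from the
not-yet-open branch. The struct and helper names are simplified stand-ins
for the nfs4_state/nfs4_stateid machinery, and the trimmed remainder of the
patched function (including the path that actually breaks the livelock) is
not modeled.

/*
 * Userspace model of the decision discussed above -- NOT the kernel code.
 * Types and helpers are simplified stand-ins so the branch logic can be
 * compiled and read in isolation.  Per Trond's comment, the branch taken
 * when NFS_OPEN_STATE is not yet set only looks at whether the incoming
 * stateid carries seqid 1; it does not compare against the (uninitialised)
 * open_stateid.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>	/* htonl()/ntohl() stand in for cpu_to_be32()/be32_to_cpu() */

struct model_stateid {
	uint32_t seqid;		/* big-endian, as on the wire */
	char	 other[12];	/* opaque "other" field */
};

struct model_state {
	bool			open_state;	/* models the NFS_OPEN_STATE bit */
	struct model_stateid	open_stateid;
};

static bool stateid_match_other(const struct model_stateid *a,
				const struct model_stateid *b)
{
	return memcmp(a->other, b->other, sizeof(a->other)) == 0;
}

/* Same wrap-safe comparison idea as nfs4_stateid_is_newer(). */
static bool stateid_is_newer(const struct model_stateid *a,
			     const struct model_stateid *b)
{
	return (int32_t)(ntohl(a->seqid) - ntohl(b->seqid)) > 0;
}

/*
 * Sketch of the two-branch structure with the flagged check dropped.
 * Returning false here is what, in the real function, sends the caller
 * back to the wait loop in nfs_set_open_stateid_locked().
 */
static bool need_update_open_stateid(const struct model_state *state,
				     const struct model_stateid *stateid)
{
	if (state->open_state) {
		/* Common case: same open, strictly newer sequence number. */
		return stateid_match_other(stateid, &state->open_stateid) &&
		       stateid_is_newer(stateid, &state->open_stateid);
	}
	/* First OPEN: only a stateid that starts at seqid 1 initialises it. */
	return stateid->seqid == htonl(1);
}

int main(void)
{
	struct model_state closed = { .open_state = false };
	struct model_stateid first = { .seqid = htonl(1) };
	struct model_stateid late  = { .seqid = htonl(2) };

	/* seqid 1 initialises a closed state; seqid 2 does not (it waits). */
	printf("first OPEN  (seqid 1): %d\n", need_update_open_stateid(&closed, &first));
	printf("late update (seqid 2): %d\n", need_update_open_stateid(&closed, &late));
	return 0;
}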