Re: [PATCH 1/2 v2] NFSv4: Fix a livelock when CLOSE pre-emptively bumps state sequence

"Benjamin Coddington" <bcodding@xxxxxxxxxx> · Tue, 22 Sep 2020 17:08:40 -0400

On 22 Sep 2020, at 15:48, Trond Myklebust wrote:

On Tue, 2020-09-22 at 15:15 -0400, Benjamin Coddington wrote:
Since commit 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
CLOSE/OPEN_DOWNGRADE") the following livelock may occur if a CLOSE
races
with the update of the nfs_state:

Process 1           Process 2           Server
=========           =========           ========
 OPEN file
                    OPEN file
                                        Reply OPEN (1)
                                        Reply OPEN (2)
 Update state (1)
 CLOSE file (1)
                                        Reply OLD_STATEID (1)
 CLOSE file (2)
                                        Reply CLOSE (-1)
                    Update state (2)
                    wait for state change
 OPEN file
                    wake
 CLOSE file
 OPEN file
                    wake
 CLOSE file
 ...
                    ...

As long as the first process continues updating state, the second
process
will fail to exit the loop in nfs_set_open_stateid_locked().  This
livelock
has been observed in generic/168.

Fix this by detecting the case in nfs_need_update_open_stateid() and
then exit the loop if:
 - the state is NFS_OPEN_STATE, and
 - the stateid sequence is > 1, and
 - the stateid doesn't match the current open stateid

Fixes: 0e0cb35b417f ("NFSv4: Handle NFS4ERR_OLD_STATEID in
CLOSE/OPEN_DOWNGRADE")
Cc: stable@xxxxxxxxxxxxxxx # v5.4+
Signed-off-by: Benjamin Coddington <bcodding@xxxxxxxxxx>
---
 fs/nfs/nfs4proc.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 6e95c85fe395..9db29a438540 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1588,18 +1588,25 @@ static void
nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
 static bool nfs_need_update_open_stateid(struct nfs4_state *state,
 		const nfs4_stateid *stateid)
 {
-	if (test_bit(NFS_OPEN_STATE, &state->flags) == 0 ||
-	    !nfs4_stateid_match_other(stateid, &state->open_stateid)) {
-		if (stateid->seqid == cpu_to_be32(1))
+	bool state_matches_other = nfs4_stateid_match_other(stateid,
&state->open_stateid);
+	bool seqid_one = stateid->seqid == cpu_to_be32(1);
+
+	if (test_bit(NFS_OPEN_STATE, &state->flags)) {
+		/* The common case - we're updating to a new sequence
number */
+		if (state_matches_other &&
nfs4_stateid_is_newer(stateid, &state->open_stateid)) {
+			nfs_state_log_out_of_order_open_stateid(state,
stateid);
+			return true;
+		}
+	} else {
+		/* This is the first OPEN */
+		if (!state_matches_other && seqid_one) {

Why are we looking at the value of state_matches_other here? If the
NFS_OPEN_STATE flags is not set, then the state->open_stateid is not
initialised, so how does it make sense to look at a comparison with 
the
incoming stateid?

You're right - it doesn't make sense. I failed to clean it up out of my
decision matrix.  I'll send a v3 after it gets tested overnight.

Thanks for the look, and if you have another moment - what do you think
about not bumping the seqid in the CLOSE/OPEN_DOWNGRADE path on 
OLD_STATEID
if the state's refcount > 1?  This would optimize away a lot of recovery 
for
races like this, but I wonder if it would be possible to end up with two
OPEN_DOWNGRADEs dueling that would never recover.

If you don't have the moment, no problem.  Thanks again for looking at 
my patch.

Ben