Re: [PATCH] nfsd4: fix up replay_matches_cache()

"J. Bruce Fields" <bfields@xxxxxxxxxxxx> · Wed, 9 Oct 2019 16:33:48 -0400

On Wed, Oct 09, 2019 at 04:19:50PM -0400, Scott Mayhew wrote:
> On Wed, 09 Oct 2019, J. Bruce Fields wrote:
> 
> > On Wed, Oct 09, 2019 at 03:11:37PM -0400, Scott Mayhew wrote:
> > > When running an nfs stress test, I see quite a few cached replies that
> > > don't match up with the actual request.  The first comment in
> > > replay_matches_cache() makes sense, but the code doesn't seem to
> > > match... fix it.
> > 
> > Thanks, I'll apply.  But I'm curious whether you're seeing any practical
> > impact from this?  I don't think it should matter.
> 
> Yes, the client is occasionally getting tied up into knots.  It appears
> to always be a REMOVE request getting a cached OPEN reply, and that
> loops over and over.  It seems like a client bug because when it
> happens, the client sends an OPEN and immediately sends a REMOVE using
> the same slot (bumping the seqid) without waiting for the OPEN reply.
> The server replies with NFS4ERR_SEQ_MISORDERED, and the client
> decrements the seqid and re-sends the REMOVE request.  Then the server
> sends the reply to the original OPEN and sends the cached OPEN reply in
> response to all the subsequent REMOVE requests.  I haven't had much luck
> in tracking it down though...

OK.  So this will make the server reply FALSE_RETRY to the resent
REMOVE, and maybe that helps the client recover in this case?

But, that's not something the client can really depend on--if the REMOVE
happened to have the same number of ops as the OPEN, the false retry
would still go undetected, and that's allowed.  I don't think the server
can be expected to catch every false retry.

No harm making the server's reply more accurate here, and the code's
clearer this way, but sounds to me like the root cause is a client bug.

--b.

> 
> -Scott
> 
> > 
> > --b.
> > 
> > > 
> > > Fixes: 53da6a53e1d4 ("nfsd4: catch some false session retries")
> > > Signed-off-by: Scott Mayhew <smayhew@xxxxxxxxxx>
> > > ---
> > >  fs/nfsd/nfs4state.c | 15 ++++++++++-----
> > >  1 file changed, 10 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index c65aeaa812d4..08f6eb2b73f8 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -3548,12 +3548,17 @@ static bool replay_matches_cache(struct svc_rqst *rqstp,
> > >  	    (bool)seq->cachethis)
> > >  		return false;
> > >  	/*
> > > -	 * If there's an error than the reply can have fewer ops than
> > > -	 * the call.  But if we cached a reply with *more* ops than the
> > > -	 * call you're sending us now, then this new call is clearly not
> > > -	 * really a replay of the old one:
> > > +	 * If there's an error then the reply can have fewer ops than
> > > +	 * the call.
> > >  	 */
> > > -	if (slot->sl_opcnt < argp->opcnt)
> > > +	if (slot->sl_opcnt < argp->opcnt && !slot->sl_status)
> > > +		return false;
> > > +	/*
> > > +	 * But if we cached a reply with *more* ops than the call you're
> > > +	 * sending us now, then this new call is clearly not really a
> > > +	 * replay of the old one:
> > > +	 */
> > > +	if (slot->sl_opcnt > argp->opcnt)
> > >  		return false;
> > >  	/* This is the only check explicitly called by spec: */
> > >  	if (!same_creds(&rqstp->rq_cred, &slot->sl_cred))
> > > -- 
> > > 2.17.2