On Fri, Nov 17, 2023 at 09:57:49AM -0500, Jeff Layton wrote:
> On Fri, 2023-11-10 at 11:28 -0500, Chuck Lever wrote:
> > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > 
> > The "statp + 1" pointer that is passed to nfsd_cache_update() is
> > supposed to point to the start of the egress NFS Reply header. In
> > fact, it does point there for AUTH_SYS and RPCSEC_GSS_KRB5 requests.
> > 
> > But both krb5i and krb5p add fields between the RPC header's
> > accept_stat field and the start of the NFS Reply header. In those
> > cases, "statp + 1" points at the extra fields instead of the Reply.
> > The result is that nfsd_cache_update() caches what looks to the
> > client like garbage.
> > 
> > A connection break can occur for a number of reasons, but the most
> > common reason when using krb5i/p is a GSS sequence number window
> > underrun. When an underrun is detected, the server is obliged to
> > drop the RPC and the connection to force a retransmit with a fresh
> > GSS sequence number. The client presents the same XID, it hits in
> > the server's DRC, and the server returns the garbage cache entry.
> > 
> > The "statp + 1" argument has been used since the oldest changeset
> > in the kernel history repo, so it has been in nfsd_dispatch()
> > literally since before history began. The problem arose only when
> > the server-side GSS implementation was added twenty years ago.
> > 
> > This particular patch applies cleanly to v6.5 and later, but needs
> > some context adjustment to apply to earlier kernels. Before v5.16,
> > nfsd_dispatch() does not use xdr_stream, so saving the NFS header
> > pointer before calling ->pc_encode is still an appropriate fix
> > but it needs to be implemented differently.
> > 
> > Cc: <stable@xxxxxxxxxxxxxxx> # v5.16+
> > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > ---
> >  fs/nfsd/nfssvc.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > index d6122bb2d167..60aacca2bca6 100644
> > --- a/fs/nfsd/nfssvc.c
> > +++ b/fs/nfsd/nfssvc.c
> > @@ -981,6 +981,7 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
> >  	const struct svc_procedure *proc = rqstp->rq_procinfo;
> >  	__be32 *statp = rqstp->rq_accept_statp;
> >  	struct nfsd_cacherep *rp;
> > +	__be32 *nfs_reply;
> >  
> >  	/*
> >  	 * Give the xdr decoder a chance to change this if it wants
> > @@ -1014,6 +1015,7 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
> >  	if (test_bit(RQ_DROPME, &rqstp->rq_flags))
> >  		goto out_update_drop;
> >  
> > +	nfs_reply = xdr_inline_decode(&rqstp->rq_res_stream, 0);
> >  	if (!proc->pc_encode(rqstp, &rqstp->rq_res_stream))
> >  		goto out_encode_err;
> >  
> > @@ -1023,7 +1025,7 @@ int nfsd_dispatch(struct svc_rqst *rqstp)
> >  	 */
> >  	smp_store_release(&rqstp->rq_status_counter, rqstp->rq_status_counter + 1);
> >  
> > -	nfsd_cache_update(rqstp, rp, rqstp->rq_cachetype, statp + 1);
> > +	nfsd_cache_update(rqstp, rp, rqstp->rq_cachetype, nfs_reply);
> >  out_cached_reply:
> >  	return 1;
> > 
> 
> With this patch, I'm seeing a regression in pynfs RPLY14. In the
> attached capture the client sends a replay of an earlier call, and the
> server responds (frame #97) with a reply that is truncated just after
> the RPC accept state.

I've reproduced it. Looking now.

-- 
Chuck Lever