Re: [PATCH] nfsd: Ensure that NFSv4 always drops the connection when dropping a request

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 24 Oct 2014 17:30:55 +0300
Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> wrote:

> On Fri, 2014-10-24 at 09:34 -0400, Jeff Layton wrote:
> > On Fri, 24 Oct 2014 07:26:44 -0400
> > Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:
> >  
> > > I guess the theory here is that the client sent the request, and got a
> > > TCP ACK and then the server never responded because it dropped the
> > > request? That sounds plausible and we almost certainly want something
> > > like the above for v4.0.
> > > 
> > > If the above patch fixes Christoph's problem with v4.1 though, then I
> > > think we'll need more investigation into why. I don't see a way to get
> > > a dropped request with v4.1. Those mainly occur due to interaction with
> > > the DRC (disabled for v4.1) or with svc_defer (disabled for all v4
> > > requests).
> > > 
> > > Is it possible that one of the nfsd threads is just hung while
> > > processing an RPC and that's why you're not getting a response?
> > > 
> > 
> > Ok, I'm able to reproduce this too and got a network trace. I think
> > Trond is correct as to the cause:
> > 
> > I see a SETATTR call go out and then see a TCP ACK for it a little
> > later, but didn't see a reply sent for it.
> > 
> > I sprinkled in some printks around the "dropit" codepath above and in
> > some other areas however and none of them fired. So while Trond's patch
> > looks like it might be correct, I don't think it'll fix the problem I'm
> > seeing.
> > 
> > Sniffing a little later, it seems like the server replies are being
> > delayed somehow:
> 
> You can get deferrals in the authentication code which trigger drops.
> Here is a variant which tries to deal with that:
> 
> 8<-----------------------------------------------------
> From a9be1292152af2cb651415c44fd3d91b3d017240 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> Date: Fri, 24 Oct 2014 12:54:47 +0300
> Subject: [PATCH v2] nfsd: Ensure that NFSv4 always drops the connection when
>  dropping a request
> 
> Both RFC3530 and RFC5661 impose a requirement on the server that it MUST NOT
> drop a request unless the connection is broken. This is in order to allow
> clients to assume that they can always expect an answer unless the
> connection is broken.
> 
> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---
> v2: Also deal with NFSv4 requests dropped due to authentication issues
> 
>  fs/nfsd/nfs4proc.c         |  6 ++++++
>  include/linux/sunrpc/svc.h |  1 +
>  net/sunrpc/svc.c           | 13 +++++++++----
>  3 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index cdeb3cfd6f32..500ac76662a8 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1960,6 +1960,11 @@ static const char *nfsd4_op_name(unsigned opnum)
>  	return "unknown_operation";
>  }
>  
> +static void nfsd4_dropme(struct svc_rqst *rqstp)
> +{
> +	svc_close_xprt(rqstp->rq_xprt);
> +}
> +
>  #define nfsd4_voidres			nfsd4_voidargs
>  struct nfsd4_voidargs { int dummy; };
>  
> @@ -1989,6 +1994,7 @@ struct svc_version	nfsd_version4 = {
>  		.vs_nproc	= 2,
>  		.vs_proc	= nfsd_procedures4,
>  		.vs_dispatch	= nfsd_dispatch,
> +		.vs_dropme	= nfsd4_dropme,
>  		.vs_xdrsize	= NFS4_SVC_XDRSIZE,
>  		.vs_rpcb_optnl	= 1,
>  };
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 21678464883a..824656da1f6d 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -400,6 +400,7 @@ struct svc_version {
>  	 * vs_dispatch == NULL means use default dispatcher.
>  	 */
>  	int			(*vs_dispatch)(struct svc_rqst *, __be32 *);
> +	void			(*vs_dropme)(struct svc_rqst *);
>  };
>  
>  /*
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index ca8a7958f4e6..a178496ce2c1 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -1111,9 +1111,13 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
>  	rqstp->rq_vers = vers = svc_getnl(argv);	/* version number */
>  	rqstp->rq_proc = proc = svc_getnl(argv);	/* procedure number */
>  
> -	for (progp = serv->sv_program; progp; progp = progp->pg_next)
> -		if (prog == progp->pg_prog)
> +	for (progp = serv->sv_program; progp; progp = progp->pg_next) {
> +		if (prog == progp->pg_prog) {
> +			if (vers < progp->pg_nvers)
> +				versp = progp->pg_vers[vers];
>  			break;
> +		}
> +	}
>  
>  	/*
>  	 * Decode auth data, and add verifier to reply buffer.
> @@ -1148,8 +1152,7 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
>  	if (progp == NULL)
>  		goto err_bad_prog;
>  
> -	if (vers >= progp->pg_nvers ||
> -	  !(versp = progp->pg_vers[vers]))
> +	if (versp == NULL)
>  		goto err_bad_vers;
>  
>  	procp = versp->vs_proc + proc;
> @@ -1228,6 +1231,8 @@ svc_process_common(struct svc_rqst *rqstp, struct kvec *argv, struct kvec *resv)
>   dropit:
>  	svc_authorise(rqstp);	/* doesn't hurt to call this twice */
>  	dprintk("svc: svc_process dropit\n");

I don't think this will fix it either. I turned the above dprintk into
a normal printk and it never fired during the test. As best I can tell,
svc_process_common is not returning 0 when this occurs.

> +	if (versp && versp->vs_dropme)
> +		versp->vs_dropme(rqstp);
>  	return 0;
>  
>  err_short_len:


-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux