> On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote: > > On Wed, Aug 10, 2016 at 02:01:20PM -0400, Chuck Lever wrote: >> Following up. >> >> >>> On Aug 3, 2016, at 3:40 PM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: >>> >>>> >>>> On Aug 3, 2016, at 1:47 PM, bfields@xxxxxxxxxxxx wrote: >>>> >>>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote: >>>>> Hi Bruce- >>>>> >>>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f >>>>> Author: J. Bruce Fields <bfields@xxxxxxxxxx> >>>>> AuthorDate: Tue May 17 12:38:21 2016 -0400 >>>>> >>>>> rpc: share one xps between all backchannels >>>>> >>>>> has added this piece of code: >>>>> >>>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args, >>>>> struct rpc_clnt *clnt = NULL; >>>>> struct rpc_xprt_switch *xps; >>>>> >>>>> - xps = xprt_switch_alloc(xprt, GFP_KERNEL); >>>>> - if (xps == NULL) { >>>>> - xprt_put(xprt); >>>>> - return ERR_PTR(-ENOMEM); >>>>> + if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) { >>>>> + WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>>>> + xps = args->bc_xprt->xpt_bc_xps; >>>>> + xprt_switch_get(xps); >>>>> + } else { >>>>> >>>>> >>>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA. >>>>> >>>>> Can you say why it was added? Is there something RPC/RDMA needs to >>>>> do to make the code safe? >>>> >>>> What is args->protocol in this case? >>>> >>>> Digging around... OK, I missed that BC_TCP and BC_RDMA were defined as >>>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying >>>> transport. That makes sense. >>>> >>>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant >>>> was "is this a backchannel". >>>> >>>> Does that fix the problem? >>> >>> This simple fix eliminates the log noise: >>> >>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c >>> index 2808d55..f94caf7 100644 >>> --- a/net/sunrpc/clnt.c >>> +++ b/net/sunrpc/clnt.c >>> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) >>> char servername[48]; >>> >>> if (args->bc_xprt) { >>> - WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP); >>> + WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC)); >>> xprt = args->bc_xprt->xpt_bc_xprt; >>> if (xprt) { >>> xprt_get(xprt); >>> >>> >>> This code seems to come from: >>> >>> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9 >>> Author: J. Bruce Fields <bfields@xxxxxxxxxx> >>> AuthorDate: Mon May 16 17:03:42 2016 -0400 >>> >>> nfsd4/rpc: move backchannel create logic into rpc code >>> >>> >>> Where it may have been copied from: >>> >>> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args) >>> -{ >>> - struct rpc_xprt *xprt; >>> - >>> - if (args->protocol != XPRT_TRANSPORT_BC_TCP) >>> - return rpc_create(args); >>> - >>> - xprt = args->bc_xprt->xpt_bc_xprt; >>> - if (xprt) { >>> - xprt_get(xprt); >>> - return rpc_create_xprt(args, xprt); >>> - } >>> - >>> - return rpc_create(args); >>> -} >>> >>> There's no warning here. In fact, protocol != BC_TCP seems to >>> be expected. >>> >>> I'm wondering if the warning is needed at all? >> >> Using NFSv4.1/RDMA against my v4.7 NFS server seems to result >> in a system deadlock in short order on the server. I haven't >> looked further into this because you mentioned you were going >> to have a look at these commits that change the backchannel >> code. > > I'm not seeing an obvious bug in those commits, for what it's worth. Thanks for checking. I've tracked the misbehavior down to a DMA mapping mismatch. It's not related to the backchannel at all. I'm putting together a fix right now. But I would like to use NFSv4.1/RDMA without that warning firing. Any reason to keep it (in either place) ? -- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html