Re: WARN_ON added to rpc_create()

Chuck Lever <chuck.lever@xxxxxxxxxx> · Fri, 19 Aug 2016 11:19:13 -0400

> On Aug 19, 2016, at 11:06 AM, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> 
>> 
>> On Aug 19, 2016, at 10:50 AM, J. Bruce Fields <bfields@xxxxxxxxxx> wrote:
>> 
>> On Thu, Aug 18, 2016 at 06:11:43PM -0400, Chuck Lever wrote:
>>> 
>>>> On Aug 18, 2016, at 5:56 PM, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>>>> 
>>>> On Wed, Aug 03, 2016 at 03:40:11PM -0400, Chuck Lever wrote:
>>>>> 
>>>>>> On Aug 3, 2016, at 1:47 PM, bfields@xxxxxxxxxxxx wrote:
>>>>>> 
>>>>>> On Wed, Aug 03, 2016 at 11:27:47AM -0400, Chuck Lever wrote:
>>>>>>> Hi Bruce-
>>>>>>> 
>>>>>>> I see that commit 39a9beab5acb83176e8b9a4f0778749a09341f1f
>>>>>>> Author:     J. Bruce Fields <bfields@xxxxxxxxxx>
>>>>>>> AuthorDate: Tue May 17 12:38:21 2016 -0400
>>>>>>> 
>>>>>>> rpc: share one xps between all backchannels
>>>>>>> 
>>>>>>> has added this piece of code:
>>>>>>> 
>>>>>>> @@ -452,10 +452,20 @@ static struct rpc_clnt *rpc_create_xprt(struct rpc_create_args *args,
>>>>>>>     struct rpc_clnt *clnt = NULL;
>>>>>>>     struct rpc_xprt_switch *xps;
>>>>>>> 
>>>>>>> -       xps = xprt_switch_alloc(xprt, GFP_KERNEL);
>>>>>>> -       if (xps == NULL) {
>>>>>>> -               xprt_put(xprt);
>>>>>>> -               return ERR_PTR(-ENOMEM);
>>>>>>> +       if (args->bc_xprt && args->bc_xprt->xpt_bc_xps) {
>>>>>>> +               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>>>>>>> +               xps = args->bc_xprt->xpt_bc_xps;
>>>>>>> +               xprt_switch_get(xps);
>>>>>>> +       } else {
>>>>>>> 
>>>>>>> 
>>>>>>> the WARN_ON here fires on the server whenever I use NFSv4.1 on RDMA.
>>>>>>> 
>>>>>>> Can you say why it was added? Is there something RPC/RDMA needs to
>>>>>>> do to make the code safe?
>>>>>> 
>>>>>> What is args->protocol in this case?
>>>>>> 
>>>>>> Digging around...  OK, I missed that BC_TCP and BC_RDMA were defined as
>>>>>> OR's of an XPRT_TRANSPORT_BC bit with the identifier of the underlying
>>>>>> transport.  That makes sense.
>>>>>> 
>>>>>> So, I should have just used XPRT_TRANSPORT_BC there--I think all I meant
>>>>>> was "is this a backchannel".
>>>>>> 
>>>>>> Does that fix the problem?
>>>>> 
>>>>> This simple fix eliminates the log noise:
>>>>> 
>>>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>>>>> index 2808d55..f94caf7 100644
>>>>> --- a/net/sunrpc/clnt.c
>>>>> +++ b/net/sunrpc/clnt.c
>>>>> @@ -520,7 +520,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
>>>>>      char servername[48];
>>>>> 
>>>>>      if (args->bc_xprt) {
>>>>> -               WARN_ON(args->protocol != XPRT_TRANSPORT_BC_TCP);
>>>>> +               WARN_ON(!(args->protocol & XPRT_TRANSPORT_BC));
>>>>>              xprt = args->bc_xprt->xpt_bc_xprt;
>>>>>              if (xprt) {
>>>>>                      xprt_get(xprt);
>>>>> 
>>>>> 
>>>>> This code seems to come from:
>>>>> 
>>>>> commit d50039ea5ee63c589b0434baa5ecf6e5075bb6f9
>>>>> Author:     J. Bruce Fields <bfields@xxxxxxxxxx>
>>>>> AuthorDate: Mon May 16 17:03:42 2016 -0400
>>>>> 
>>>>>  nfsd4/rpc: move backchannel create logic into rpc code
>>>>> 
>>>>> 
>>>>> Where it may have been copied from:
>>>>> 
>>>>> -static struct rpc_clnt *create_backchannel_client(struct rpc_create_args *args)
>>>>> -{
>>>>> -       struct rpc_xprt *xprt;
>>>>> -
>>>>> -       if (args->protocol != XPRT_TRANSPORT_BC_TCP)
>>>>> -               return rpc_create(args);
>>>>> -
>>>>> -       xprt = args->bc_xprt->xpt_bc_xprt;
>>>>> -       if (xprt) {
>>>>> -               xprt_get(xprt);
>>>>> -               return rpc_create_xprt(args, xprt);
>>>>> -       }
>>>>> -
>>>>> -       return rpc_create(args);
>>>>> -}
>>>>> 
>>>>> There's no warning here. In fact, protocol != BC_TCP seems to
>>>>> be expected.
>>>> 
>>>> The protocol should be BC_TCP (OK, actually just BC) if and only if
>>>> bc_xprt is set.
>>>> 
>>>> (The BC_TCP case is the 4.1+ case, the other is the 4.0 case.  In the
>>>> 4.1+ case, the new client uses an existing (client-initiated)
>>>> connection, in the 4.0 case, the new client must also have a new
>>>> connection.
>>>> 
>>>> In the 4.0 case we'll always create a new xprt, in the 4.1 case we might
>>>> or might not--depends on whether that particular connection has been
>>>> used for a backchannel previously.)
>>> 
>>> OK, but why is a WARN_ON needed here? Why not return -EINVAL,
>>> for example (once you've corrected BC_TCP -> BC) ?
>> 
>> Well, it would be a programming bug, so I'd want a WARN_ON or similar
>> somewhere, I don't care particularly where it is if you see a better way
>> to organize things.
> 
> The way it works now, the WARN_ON fires, but the logic goes ahead
> and creates the transport anyway.
> 
> If this is a programming bug, it should fail and return an error,
> no transport should be created. I can see a WARN_ON being useful
> because it displays a backtrace which identifies the broken
> caller.
> 
> If it is not a programming bug (which is implied by the fact that
> a transport is created anyway) then no WARN_ON is needed.
> 
> If you think it is correct that a WARN_ON fires _and_ a transport
> is created, could a comment be added explaining that? The new
> logic seems less straightforward to me than what it replaces.

Also, WARN_ONCE might be preferable.

--
Chuck Lever

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html