Re: [PATCH v3 2/2] NFSv4.1: Use the nfs_client's rpc timeouts for backchannel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4 Jan 2024, at 7:13, Benjamin Coddington wrote:

> On 4 Jan 2024, at 6:55, Benjamin Coddington wrote:
>
>> On 3 Jan 2024, at 18:00, Trond Myklebust wrote:
>>
>>> On Wed, 2024-01-03 at 16:45 -0500, Anna Schumaker wrote:
>>>> Hi Ben,
>>>>
>>>> On Wed, Dec 13, 2023 at 2:49 PM Benjamin Coddington
>>>> <bcodding@xxxxxxxxxx> wrote:
>>>>>
>>>>> For backchannel requests that lookup the appropriate nfs_client,
>>>>> use the
>>>>> state-management rpc_clnt's rpc_timeout parameters for the
>>>>> backchannel's
>>>>> response.  When the nfs_client cannot be found, fall back to using
>>>>> the
>>>>> xprt's default timeout parameters.
>>>>
>>>> I'm seeing a use-after-free after applying this patch when using pNFS
>>>> and session trunking. Any idea what's going on? Here is the stack
>>>> trace I'm seeing:
>>>
>>> I'm going to guess that this is happening because nothing is clearing
>>> rqstp->bc_rpc_clnt after a call to svc_process_bc(). So if something
>>> later calls CB_NULL, then the resulting svc_process_bc() will free an
>>> extra reference.
>>
>> Doh!
>>
>>>>
>>>> I hit this while testing against ontap, if that helps.
>>>>
>>>> Thanks,
>>>> Anna
>>
>> Thank you for the test!!
>>
>>>>> --- a/fs/nfs/callback_xdr.c
>>>>> +++ b/fs/nfs/callback_xdr.c
>>>>> @@ -967,6 +967,9 @@ static __be32 nfs4_callback_compound(struct
>>>>> svc_rqst *rqstp)
>>>>>                 nops--;
>>>>>         }
>>>>>
>>>>> +       if (svc_is_backchannel(rqstp) && cps.clp)
>>>>> +               rqstp->bc_rpc_clnt = cps.clp->cl_rpcclient;
>>>
>>>
>>> You can re-create the clnt->cl_timeout in svc_process_bc() by just
>>> storing the values of to_initval and to_retrans here. Why store a
>>> reference to an entire rpc client structure that you don't need?
>>
>> Hmm, I think I started by thinking we could simply set tk_client, but then
>> didn't end up with that for other reasons and just kept passing the single
>> pointer.
>
> .. and the client being NULL was the signal to fallback to using the xprt
> timeouts.  That's lost with to_initval and to_retries because they can be
> set to zero deliberately.

I'm wrong about this ^^.  The mount option timeo must be > 0, so we can
check if to_initval is zero to see if we should fall back to the xprt
default.

Ben





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux