Re: [RFC PATCH] NFS: CB_OFFLOAD should return DELAY when no copy state ID matches

On 2/13/25 1:44 PM, Trond Myklebust wrote:
> On Thu, 2025-02-13 at 12:53 -0500, Olga Kornievskaia wrote:
>> On Thu, Feb 13, 2025 at 11:59 AM Trond Myklebust
>> <trondmy@xxxxxxxxxxxxxxx> wrote:
>>>
>>> On Thu, 2025-02-13 at 11:15 -0500, cel@xxxxxxxxxx wrote:
>>>> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
>>>>
>>>> The NFSv4.2 protocol requires that a client match a CB_OFFLOAD
>>>> callback to a COPY reply containing the same copy state ID.
>>>> However, it's possible that the order of the callback and reply
>>>> processing on the client can cause the CB_OFFLOAD to be received
>>>> and processed /before/ the client has dealt with the COPY reply.
>>>>
>>>> Currently, in this case, the Linux NFS client will queue a fresh
>>>> struct nfs4_copy_state in the CB_OFFLOAD handler.
>>>> handle_async_copy() then checks for a matching nfs4_copy_state
>>>> before settling down to wait for a CB_OFFLOAD reply.
>>>>
>>>> But it would be simpler for the client's callback service to
>>>> respond to such a CB_OFFLOAD with "I'm not ready yet" and have the
>>>> server send the CB_OFFLOAD again later. This avoids the need for
>>>> the client's CB_OFFLOAD processing to allocate an extra struct
>>>> nfs4_copy_state -- in most cases that allocation will be tossed
>>>> immediately, and it's one less memory allocation that we have to
>>>> worry about accidentally leaking or accumulating over time.
>>>
>>> Why can't the server just fill an appropriate entry for
>>> csa_referring_call_lists<> in the CB_SEQUENCE operation for the
>>> CB_OFFLOAD callback? That's the mechanism that is intended to be
>>> used to avoid the above kind of race.
>>
>> Let's say the Linux server does implement the list, but what about
>> other implementations that will not? The client still needs an
>> approach to handle the CB_OFFLOAD/COPY reply race.
>>>
> 
> There are several cases that need to be handled. Off the top of my
> head:
>    1. The reply to COPY hasn't yet been processed.
>    2. The COPY is complete, and the state has been forgotten.
>    3. The stateid presented by CB_OFFLOAD is one that was reused for a
>       second COPY request after a previous one completed.
> 
> The client will want to send different errors for each case
> (NFS4ERR_DELAY in the first and third case, NFS4ERR_BAD_STATEID in the
> second).
> With csa_referring_call_lists<>, the client can easily distinguish
> between the cases and return the right response. Without it, the client
> might end up returning NFS4ERR_BAD_STATEID in case 3, or NFS4ERR_DELAY
> in case 2, etc...
> 
> So in practice, we want all servers to do the right thing if they want
> to avoid confusion over state. The client can't fix these races on its
> own.
> 

We currently live in a world where no NFSD-based server returns
referring calls. I think we need to understand what the client
should do in that case.
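
To make the question concrete, here is a minimal sketch of the decision
the client's callback service would face, under one reading of the three
cases listed above. The struct, its fields, and cb_offload_status() are
hypothetical names for illustration and are not taken from the patch or
from the existing client code; only the status values come from the
protocol. It assumes that case 3 shows up as a referring call naming a
COPY whose reply is still outstanding, and that the fallback without
referring calls is the "return DELAY when nothing matches" behavior
proposed in this RFC.

```c
#include <stdbool.h>

/* Status values as assigned by the NFSv4 protocol. */
enum {
	NFS4_OK             = 0,
	NFS4ERR_DELAY       = 10008,
	NFS4ERR_BAD_STATEID = 10025,
};

/* Hypothetical summary of what the client knows when CB_OFFLOAD arrives. */
struct cb_offload_ctx {
	bool have_referring_call;   /* CB_SEQUENCE identified the COPY call */
	bool referred_copy_pending; /* reply to that COPY not yet processed */
	bool stateid_matches;       /* a waiting copy has this copy state ID */
};

/* Pick the status to return for a CB_OFFLOAD (illustrative only). */
static int cb_offload_status(const struct cb_offload_ctx *ctx)
{
	if (ctx->have_referring_call) {
		/* Cases 1 and 3: the COPY reply is still in flight. */
		if (ctx->referred_copy_pending)
			return NFS4ERR_DELAY;
		/* Case 2 when nothing matches: state is gone for good. */
		return ctx->stateid_matches ? NFS4_OK : NFS4ERR_BAD_STATEID;
	}

	/*
	 * No referring call: a missing match could be case 1 or case 2,
	 * and a match could be a stale one from case 3. The client can
	 * only guess; this sketch guesses DELAY when nothing matches.
	 */
	return ctx->stateid_matches ? NFS4_OK : NFS4ERR_DELAY;
}
```

In this sketch the second branch is where the ambiguity lives: a server
that omits referring calls gets NFS4ERR_DELAY for a forgotten copy
(case 2) and NFS4_OK for a reused state ID (case 3), which is the
confusion described above.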

-- 
Chuck Lever