On Thu, 2025-02-13 at 12:06 -0500, Chuck Lever wrote:
> On 2/13/25 11:54 AM, Trond Myklebust wrote:
> > On Thu, 2025-02-13 at 11:15 -0500, cel@xxxxxxxxxx wrote:
> > > From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> > > 
> > > The NFSv4.2 protocol requires that a client match a CB_OFFLOAD
> > > callback to a COPY reply containing the same copy state ID.
> > > However, it's possible that the order of the callback and reply
> > > processing on the client can cause the CB_OFFLOAD to be received
> > > and processed /before/ the client has dealt with the COPY reply.
> > > 
> > > Currently, in this case, the Linux NFS client will queue a fresh
> > > struct nfs4_copy_state in the CB_OFFLOAD handler.
> > > handle_async_copy() then checks for a matching nfs4_copy_state
> > > before settling down to wait for a CB_OFFLOAD reply.
> > > 
> > > But it would be simpler for the client's callback service to
> > > respond to such a CB_OFFLOAD with "I'm not ready yet" and have
> > > the server send the CB_OFFLOAD again later. This avoids the need
> > > for the client's CB_OFFLOAD processing to allocate an extra
> > > struct nfs4_copy_state -- in most cases that allocation will be
> > > tossed immediately, and it's one less memory allocation that we
> > > have to worry about accidentally leaking or accumulating over
> > > time.
> > 
> > Why can't the server just fill an appropriate entry for
> > csa_referring_call_lists<> in the CB_SEQUENCE operation for the
> > CB_OFFLOAD callback? That's the mechanism that is intended to be
> > used to avoid the above kind of race.
> 
> Intriguing suggestion.
> 
> It would be helpful if that were called out in RFC 7862. Should
> support for referring call lists be a requirement, then, for async
> COPY offload?
> 
> I don't see a normative mandatory-to-implement statement for rcl's in
> RFC 8881, if that matters.

No, but in practice it is impossible to resolve several types of races
without it. Particularly for delegations.

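To make the mechanism concrete, here is a rough sketch of the check a
client could run on csa_referring_call_lists<> when a CB_SEQUENCE
arrives. It is illustrative only and is not the Linux client code: the
fore_slot and fore_session structs, call_is_outstanding() and
check_referring_calls() are hypothetical names, and the slot
bookkeeping is assumed. Only the field layout (session ID, slot ID,
sequence ID) and the NFS4ERR_DELAY value follow RFC 8881.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SESSIONID_LEN 16      /* NFS4_SESSIONID_SIZE in RFC 8881 */
#define NFS4_OK       0
#define NFS4ERR_DELAY 10008   /* value from RFC 8881 */

/* Mirrors referring_call4: one fore-channel request named by the server. */
struct referring_call {
	uint32_t seqid;
	uint32_t slotid;
};

/* Mirrors referring_call_list4: referring calls grouped by session. */
struct referring_call_list {
	uint8_t sessionid[SESSIONID_LEN];
	size_t ncalls;
	const struct referring_call *calls;
};

/* Hypothetical client-side view of one fore-channel slot. */
struct fore_slot {
	uint32_t seqid;      /* sequence ID of the last request sent */
	bool reply_pending;  /* true until that reply is fully processed */
};

/* Hypothetical client-side view of one session's slot table. */
struct fore_session {
	uint8_t sessionid[SESSIONID_LEN];
	size_t nslots;
	struct fore_slot *slots;
};

/* True if the named request was sent but its reply is not yet processed. */
static bool call_is_outstanding(const struct fore_session *s,
				const struct referring_call *rc)
{
	const struct fore_slot *slot;

	if (rc->slotid >= s->nslots)
		return false;
	slot = &s->slots[rc->slotid];
	return slot->reply_pending && slot->seqid == rc->seqid;
}

/*
 * Return NFS4ERR_DELAY if the callback refers to a request whose reply
 * (for example the COPY reply) has not been processed yet, so that the
 * server resends the callback later. Otherwise return NFS4_OK.
 */
uint32_t check_referring_calls(const struct fore_session *s,
			       const struct referring_call_list *lists,
			       size_t nlists)
{
	size_t i, j;

	for (i = 0; i < nlists; i++) {
		if (memcmp(lists[i].sessionid, s->sessionid, SESSIONID_LEN) != 0)
			continue;
		for (j = 0; j < lists[i].ncalls; j++)
			if (call_is_outstanding(s, &lists[i].calls[j]))
				return NFS4ERR_DELAY;
	}
	return NFS4_OK;
}
```

With a check like this, the CB_OFFLOAD handler would never see a copy
state ID that the COPY reply has not yet delivered, so it would not
need to allocate a placeholder nfs4_copy_state.
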
> Practically speaking, though, NFSD callback does not (yet) support
> referring call lists. It's been left as an exercise for some time. We
> simply haven't had a strong driver for it. Maybe we do now.

It is also needed for the same kind of race with delegation recalls,
layout recalls, CB_NOTIFY_DEVICEID and would also be helpful (although
not as strongly required) for CB_NOTIFY_LOCK.

IOW: there should be several other incentives for wanting to implement
it in knfsd.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx