Re: [RFC PATCH 5/6] NFS: Use NFSv4.2's OFFLOAD_STATUS operation

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Wed, 3 Jul 2024 18:52:56 +0000

> On Jul 2, 2024, at 2:30 PM, Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
> 
> Again I don't think this approach is going to work because of how
> callback and the copy thread are handled (as of now). In
> handle_async_copy() when we come out of
> wait_for_completion_interruptible() we know the callback has arrived
> and it has signalled the copy thread and thus we remove the copy
> request from the list. However, on the timeout, we didn't receive the
> wake up and thus if we remove the copy from the list then, when the
> callback thread asynchronously receives the callback it won't have the
> request to match it to. And in case the server does support an
> offload_status query I guess that's ok, but imagine it didn't. So now,
> we'd send the offload_status and get not supported and we'd go back to
> waiting but we'd already missed the callback because it came,
> didn't find the matching request, and is now just dropped on the floor.

If the client reports that it can't find a matching request,
then the server will keep the copy stateid (it's allowed to
delete the copy stateid /only if/ it gets an NFS4_OK in the
CB_OFFLOAD reply, as I read the spec).

The client will wait again, then send another OFFLOAD_STATUS
in a few seconds, and will see that the COPY completed. The
server is then allowed to delete the copy stateid.
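
To make that sequence concrete, here is a rough user-space sketch of
it. It is a toy model, not code from the patch series: every name in
it (sim_server, sim_cb_offload_reply, sim_offload_status,
sim_wait_for_copy) is hypothetical, each loop iteration stands in for
one timed-out wait, and the server-side rules are only my reading of
the spec as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical poll outcomes; these are not kernel or protocol symbols. */
enum poll_result { COPY_IN_PROGRESS, COPY_DONE, STATUS_NOTSUPP };

struct sim_server {
	int  ticks_until_done;	/* wait periods left before the copy finishes */
	bool stateid_valid;	/* server still holds the copy stateid */
	bool supports_status;	/* server implements OFFLOAD_STATUS */
};

/* Server handling of a CB_OFFLOAD reply: free the stateid only on NFS4_OK. */
void sim_cb_offload_reply(struct sim_server *srv, bool client_said_ok)
{
	if (client_said_ok)
		srv->stateid_valid = false;
}

/* Server handling of OFFLOAD_STATUS for the copy stateid. */
enum poll_result sim_offload_status(struct sim_server *srv)
{
	if (!srv->supports_status)
		return STATUS_NOTSUPP;
	assert(srv->stateid_valid);	/* the stateid was kept, so this works */
	if (srv->ticks_until_done > 0)
		return COPY_IN_PROGRESS;
	srv->stateid_valid = false;	/* client saw completion; safe to free */
	return COPY_DONE;
}

/*
 * Client side: wait, time out, poll, repeat. Returns the number of
 * polls it took to observe completion, or -1 if polling cannot help.
 */
int sim_wait_for_copy(struct sim_server *srv, int max_polls)
{
	for (int polls = 1; polls <= max_polls; polls++) {
		if (srv->ticks_until_done > 0)
			srv->ticks_until_done--;	/* time passes while waiting */
		switch (sim_offload_status(srv)) {
		case COPY_DONE:
			return polls;
		case STATUS_NOTSUPP:
			return -1;
		case COPY_IN_PROGRESS:
			break;
		}
	}
	return -1;
}
```

In this model, a CB_OFFLOAD that the client answers with an error
leaves stateid_valid set, so the next poll still finds the copy and
reports it done. The -1 return for STATUS_NOTSUPP is the case where
a lost callback cannot be recovered by polling.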

---

Again, if a server doesn't support OFFLOAD_STATUS, the only
reliable recourse is for the client to stop using async COPY.
My patch series doesn't implement that, currently. It could,
say, send a dummy OFFLOAD_STATUS before its first COPY
operation to determine whether to use sync or async COPY
going forward.
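
Something along these lines is what I have in mind for the probe.
Again, this is only a sketch and not part of the series: the types
and functions (sim_client, sim_probe_fn, sim_choose_copy_mode) and
the status names are made up for illustration. It also assumes that
a server which implements OFFLOAD_STATUS rejects the dummy stateid
with some error other than "not supported".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status codes, used only by this sketch. */
enum sim_status { SIM_OK, SIM_ERR_BAD_STATEID, SIM_ERR_NOTSUPP };

enum copy_mode { COPY_MODE_UNPROBED, COPY_MODE_SYNC, COPY_MODE_ASYNC };

struct sim_client {
	enum copy_mode mode;	/* cached decision for this server */
	int probes_sent;	/* how many dummy OFFLOAD_STATUS ops were sent */
};

/* Stand-in for sending OFFLOAD_STATUS with a dummy stateid. */
typedef enum sim_status (*sim_probe_fn)(void);

/*
 * Decide once, before the first COPY, whether async COPY is safe to
 * use. Any reply other than "not supported" is taken to mean that
 * the server recognizes the operation.
 */
enum copy_mode sim_choose_copy_mode(struct sim_client *clp, sim_probe_fn probe)
{
	assert(clp);
	if (clp->mode != COPY_MODE_UNPROBED)
		return clp->mode;	/* probe only once per server */

	clp->probes_sent++;
	if (probe() == SIM_ERR_NOTSUPP)
		clp->mode = COPY_MODE_SYNC;
	else
		clp->mode = COPY_MODE_ASYNC;
	return clp->mode;
}

/* Two fake servers for exercising the decision logic. */
enum sim_status sim_probe_supported(void)   { return SIM_ERR_BAD_STATEID; }
enum sim_status sim_probe_unsupported(void) { return SIM_ERR_NOTSUPP; }
```

Caching the result means the cost is one extra round trip per
server, paid before the first COPY.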

The block layer folks I talked to at LSF unanimously stated
that there is no way to implement COPY offload reliably
without the ability for the client/initiator to probe copy
status.

IMO the spec should say (but probably does not) that a server
MUST implement OFFLOAD_STATUS if it supports async COPY.
Likewise for the client.

--
Chuck Lever