On Fri, Dec 20, 2024 at 10:46 AM <cel@xxxxxxxxxx> wrote: > > From: Chuck Lever <chuck.lever@xxxxxxxxxx> > > We've found that there are cases where a transport disconnection > results in the loss of callback RPCs. NFS servers typically do not > retransmit callback operations after a disconnect. > > This can be a problem for the Linux NFS client's current > implementation of asynchronous COPY, which waits indefinitely for a > CB_OFFLOAD callback. If a transport disconnect occurs while an async > COPY is running, there's a good chance the client will never get the > completing CB_OFFLOAD. > > Fix this by implementing the OFFLOAD_STATUS operation so that the > Linux NFS client can probe the NFS server if it doesn't see a > CB_OFFLOAD in a reasonable amount of time. > > This patch implements a simplistic check. As future work, the client > might also be able to detect whether there is no forward progress on > the request asynchronous COPY operation, and CANCEL it. > > Suggested-by: Olga Kornievskaia <kolga@xxxxxxxxxx> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218735 > Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > --- > fs/nfs/nfs42proc.c | 70 ++++++++++++++++++++++++++++++++++++++-------- > 1 file changed, 59 insertions(+), 11 deletions(-) > > diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c > index 7fd0f2aa42d4..65cfdb5c7b02 100644 > --- a/fs/nfs/nfs42proc.c > +++ b/fs/nfs/nfs42proc.c > @@ -175,6 +175,25 @@ int nfs42_proc_deallocate(struct file *filep, loff_t offset, loff_t len) > return err; > } > > +/* Wait this long before checking progress on a COPY operation */ > +enum { > + NFS42_COPY_TIMEOUT = 3 * HZ, I'm really not a fan of such a short time out. This make the OFFLOAD_STATUS a more likely operation rather than a less likely operation to occur during a copy. OFFLOAD_STATUS and CB_OFFLOAD being concurrent operations introduce races which we have to try to account for. > +}; > + > +static void nfs4_copy_dequeue_callback(struct nfs_server *dst_server, > + struct nfs_server *src_server, > + struct nfs4_copy_state *copy) > +{ > + spin_lock(&dst_server->nfs_client->cl_lock); > + list_del_init(©->copies); > + spin_unlock(&dst_server->nfs_client->cl_lock); > + if (dst_server != src_server) { > + spin_lock(&src_server->nfs_client->cl_lock); > + list_del_init(©->src_copies); > + spin_unlock(&src_server->nfs_client->cl_lock); > + } > +} > + > static int handle_async_copy(struct nfs42_copy_res *res, > struct nfs_server *dst_server, > struct nfs_server *src_server, > @@ -184,9 +203,10 @@ static int handle_async_copy(struct nfs42_copy_res *res, > bool *restart) > { > struct nfs4_copy_state *copy, *tmp_copy = NULL, *iter; > - int status = NFS4_OK; > struct nfs_open_context *dst_ctx = nfs_file_open_context(dst); > struct nfs_open_context *src_ctx = nfs_file_open_context(src); > + int status = NFS4_OK; > + u64 copied; > > copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_KERNEL); > if (!copy) > @@ -224,15 +244,12 @@ static int handle_async_copy(struct nfs42_copy_res *res, > spin_unlock(&src_server->nfs_client->cl_lock); > } > > - status = wait_for_completion_interruptible(©->completion); > - spin_lock(&dst_server->nfs_client->cl_lock); > - list_del_init(©->copies); > - spin_unlock(&dst_server->nfs_client->cl_lock); > - if (dst_server != src_server) { > - spin_lock(&src_server->nfs_client->cl_lock); > - list_del_init(©->src_copies); > - spin_unlock(&src_server->nfs_client->cl_lock); > - } > +wait: > + status = wait_for_completion_interruptible_timeout(©->completion, > + NFS42_COPY_TIMEOUT); > + if (!status) > + goto timeout; > + nfs4_copy_dequeue_callback(dst_server, src_server, copy); > if (status == -ERESTARTSYS) { > goto out_cancel; > } else if (copy->flags || copy->error == NFS4ERR_PARTNER_NO_AUTH) { > @@ -242,6 +259,7 @@ static int handle_async_copy(struct nfs42_copy_res *res, > } > out: > res->write_res.count = copy->count; > + /* Copy out the updated write verifier provided by CB_OFFLOAD. */ > memcpy(&res->write_res.verifier, ©->verf, sizeof(copy->verf)); > status = -copy->error; > > @@ -253,6 +271,36 @@ static int handle_async_copy(struct nfs42_copy_res *res, > if (!nfs42_files_from_same_server(src, dst)) > nfs42_do_offload_cancel_async(src, src_stateid); > goto out_free; > +timeout: > + status = nfs42_proc_offload_status(dst, ©->stateid, &copied); Regardless of what OFFLOAD_STATUS returned we have to check whether or not the CB_OFFLOAD had arrived while we were waiting for the reply to the OFFLOAD_STATUS. > + if (status == -EINPROGRESS) > + goto wait; > + nfs4_copy_dequeue_callback(dst_server, src_server, copy); > + switch (status) { > + case 0: > + /* The server recognized the copy stateid, so it hasn't > + * rebooted. Don't overwrite the verifier returned in the > + * COPY result. */ > + res->write_res.count = copied; > + goto out_free; In case OFFLOAD_STATUS was successful and CB_OFFLOAD was received we should take the verifier from the CB_OFFLOAD otherwise we are sending the unneeede and expensive COMMIT because OFFLOAD_STATUS carries the completion and value of copy it does not carry the "how committed" value and thus we are forced to use async copy's "how committed" value. > + case -EREMOTEIO: > + /* COPY operation failed on the server. */ > + status = -EOPNOTSUPP; > + res->write_res.count = copied; > + goto out_free; > + case -EBADF: > + /* Server did not recognize the copy stateid. It has > + * probably restarted and lost the plot. */ > + res->write_res.count = 0; > + status = -EOPNOTSUPP; > + break; This is the case of receiving a BAD_STATEID from OFFLOAD_STATUS and this would lead to copy falling back to read/write operation. IF we don't check the existence of CB_OFFLOAD reply, then OFFLOAD_STATUS can and will carry BAD_STATEID as the server is free to delete copy status after it get CB_OFFLOAD reply (which as i said we have to check). And If we did then we need take the result of the CB_OFFLOAD and not act on OFFLOAD_STATUS's error. > + case -EOPNOTSUPP: > + /* RFC 7862 REQUIREs server to support OFFLOAD_STATUS when > + * it has signed up for an async COPY, so server is not > + * spec-compliant. */ > + res->write_res.count = 0; > + } > + goto out_free; > } > > static int process_copy_commit(struct file *dst, loff_t pos_dst, > @@ -643,7 +691,7 @@ _nfs42_proc_offload_status(struct nfs_server *server, struct file *file, > * Other negative errnos indicate the client could not complete the > * request. > */ > -static int __maybe_unused > +static int > nfs42_proc_offload_status(struct file *dst, nfs4_stateid *stateid, u64 *copied) > { > struct inode *inode = file_inode(dst); > -- > 2.47.0 > >