Re: [PATCH] NFSv4: fix a mount deadlock in NFS v4.1 client

On Fri, 2024-09-06 at 00:57 +0000, Oleksandr Tymoshenko wrote:
> nfs41_init_clientid does not signal a failure condition from
> nfs4_proc_exchange_id and nfs4_proc_create_session to a client which
> may lead to mount syscall indefinitely blocked in the following stack
> trace:
>   nfs_wait_client_init_complete
>   nfs41_discover_server_trunking
>   nfs4_discover_server_trunking
>   nfs4_init_client
>   nfs4_set_client
>   nfs4_create_server
>   nfs4_try_get_tree
>   vfs_get_tree
>   do_new_mount
>   __se_sys_mount
> 
> and the client stuck in uninitialized state.
> 
> In addition to this all subsequent mount calls would also get blocked
> in nfs_match_client waiting for the uninitialized client to finish
> initialization:
>   nfs_wait_client_init_complete
>   nfs_match_client
>   nfs_get_client
>   nfs4_set_client
>   nfs4_create_server
>   nfs4_try_get_tree
>   vfs_get_tree
>   do_new_mount
>   __se_sys_mount
> 
> To avoid this situation propagate error condition to the mount thread
> and let mount syscall fail properly.
> 
> Signed-off-by: Oleksandr Tymoshenko <ovt@xxxxxxxxxx>
> ---
>  fs/nfs/nfs4state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 877f682b45f2..54ad3440ad2b 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -335,8 +335,8 @@ int nfs41_init_clientid(struct nfs_client *clp, const struct cred *cred)
>  	if (!(clp->cl_exchange_flags & EXCHGID4_FLAG_CONFIRMED_R))
>  		nfs4_state_start_reclaim_reboot(clp);
>  	nfs41_finish_session_reset(clp);
> -	nfs_mark_client_ready(clp, NFS_CS_READY);
>  out:
> +	nfs_mark_client_ready(clp, status == 0 ? NFS_CS_READY : status);
>  	return status;
>  }

NACK. This will break all sorts of recovery scenarios, because it
doesn't distinguish between an initial 'mount' and a server reboot
recovery situation.
Even in the case where we are in the initial mount, it also doesn't
distinguish between transient errors such as NFS4ERR_DELAY or reboot
errors such as NFS4ERR_STALE_CLIENTID, etc.
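
To illustrate the two distinctions above, here is a rough standalone sketch,
not kernel code and not a proposed fix. The enum, the helper name
classify_init_status() and the initial_mount flag are all hypothetical; only
the two error numbers come from the NFSv4.1 protocol definition. Treating
every other error during recovery as retryable is an assumption made purely
for the example.

```c
#include <stdbool.h>

/* NFSv4 status codes, numbered as in the protocol specification. */
#define NFS4ERR_DELAY          10008
#define NFS4ERR_STALE_CLIENTID 10022

/* Hypothetical outcome of client initialisation; not a kernel type. */
enum init_action {
	INIT_READY,      /* mark the client ready */
	INIT_RETRY,      /* leave it to the state manager to try again */
	INIT_FAIL_MOUNT, /* report the error to the waiting mount */
};

/*
 * Hypothetical helper. 'status' is 0 or a negative error, as kernel
 * functions return it. 'initial_mount' is assumed to be true only while
 * the first mount is still waiting for the client to initialise.
 */
static enum init_action classify_init_status(int status, bool initial_mount)
{
	if (status == 0)
		return INIT_READY;

	switch (status) {
	case -NFS4ERR_DELAY:          /* transient: server asked us to wait */
	case -NFS4ERR_STALE_CLIENTID: /* reboot: a new EXCHANGE_ID is needed */
		return INIT_RETRY;
	}

	/* Assumption: during recovery, keep retrying instead of failing. */
	return initial_mount ? INIT_FAIL_MOUNT : INIT_RETRY;
}
```

With this split, only a non-transient error seen during the initial mount
would ever reach the mount thread; everything else stays on the recovery
path.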

Exactly what is the scenario that is causing your hang? Let's try to
address that with a more targeted fix.


-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






