[PATCH] NFSv4: fix a mount deadlock in NFS v4.1 client

Oleksandr Tymoshenko <ovt@xxxxxxxxxx> · Mon, 9 Sep 2024 16:06:21 -0700

On Mon, Sep 9, 2024 at 10:56 AM Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>
> On Mon, 2024-09-09 at 16:36 +0000, Oleksandr Tymoshenko wrote:
> > > > nfs41_init_clientid does not signal a failure condition from
> > > > nfs4_proc_exchange_id and nfs4_proc_create_session to a client
> > > > which
> > > > may
> > > > lead to mount syscall indefinitely blocked in the following stack
> >
> > > NACK. This will break all sorts of recovery scenarios, because it
> > > doesn't distinguish between an initial 'mount' and a server reboot
> > > recovery situation.
> > > Even in the case where we are in the initial mount, it also doesn't
> > > distinguish between transient errors such as NFS4ERR_DELAY or
> > > reboot
> > > errors such as NFS4ERR_STALE_CLIENTID, etc.
> >
> > > Exactly what is the scenario that is causing your hang? Let's try
> > > to
> > > address that with a more targeted fix.
> >
> > The scenario is as follows: there are several NFS servers and several
> > production machines with multiple NFS mounts. This is a containerized
> > multi-tennant workflow so every tennant gets its own NFS mount to
> > access their
> > data. At some point nfs41_init_clientid fails in the initial
> > mount.nfs call
> > and all subsequent mount.nfs calls just hang in
> > nfs_wait_client_init_complete
> > until the original one, where nfs4_proc_exchange_id has failed, is
> > killed.
> >
> > The cause of the nfs41_init_clientid failure in the production case
> > is a timeout.
> > The following error message is observed in logs:
> >   NFS: state manager: lease expired failed on NFSv4 server <ip> with
> > error 110
> >
>
> How about something like the following fix then?
> 8<-----------------------------------------------
> From eb402b489bb0d0ada1a3dd9101d4d7e193402e46 Mon Sep 17 00:00:00 2001
> Message-ID: <eb402b489bb0d0ada1a3dd9101d4d7e193402e46.1725904471.git.trond.myklebust@xxxxxxxxxxxxxxx>
> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> Date: Mon, 9 Sep 2024 13:47:07 -0400
> Subject: [PATCH] NFSv4: Fail mounts if the lease setup times out
>
> If the server is down when the client is trying to mount, so that the
> calls to exchange_id or create_session fail, then we should allow the
> mount system call to fail rather than hang and block other mount/umount
> calls.
>
> Reported-by: Oleksandr Tymoshenko <ovt@xxxxxxxxxx>
> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> ---
>  fs/nfs/nfs4state.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 30aba1dedaba..59dcdf9bc7b4 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -2024,6 +2024,12 @@ static int nfs4_handle_reclaim_lease_error(struct nfs_client *clp, int status)
>                 nfs_mark_client_ready(clp, -EPERM);
>                 clear_bit(NFS4CLNT_LEASE_CONFIRM, &clp->cl_state);
>                 return -EPERM;
> +       case -ETIMEDOUT:
> +               if (clp->cl_cons_state == NFS_CS_SESSION_INITING) {
> +                       nfs_mark_client_ready(clp, -EIO);
> +                       return -EIO;
> +               }
> +               fallthrough;
>         case -EACCES:
>         case -NFS4ERR_DELAY:
>         case -EAGAIN:
> --

This patch fixes the issue in my simulated environment. ETIMEDOUT is
the error code that
was observed in the production env but I guess it's not the only
possible one. Would it make
sense to handle all error conditions in the NFS_CS_SESSION_INITING
state or are there
some others that are recoverable?