Re: NFSv4: Mounting NFS server which is down, blocks all other NFS mounts on same machine

Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> · Wed, 9 Jun 2021 15:19:14 +0000

On Wed, 2021-06-09 at 11:00 -0400, Benjamin Coddington wrote:
> On 9 Jun 2021, at 10:41, Trond Myklebust wrote:
> 
> > On Wed, 2021-06-09 at 10:31 -0400, Benjamin Coddington wrote:
> > > 
> > > It's not disputed that mounts waiting on the transport layer will block
> > > other mounts.
> > > 
> > > It might be able to be changed:  there's this torch:
> > > https://lore.kernel.org/linux-nfs/87378omld4.fsf@xxxxxxxxxxxxxxxxxxxxxxxx/
> > > 
> > 
> > No.
> > 
> > > ..or there may be another way we don't have to wait ..
> > > 
> > 
> > OK. So let's look at how we can solve the problem of the initial
> > connection to the server timing out and causing hangs in
> > nfs41_walk_client_list(), and leave aside any other corner case
> > problems (such as the server going down while we're mounting).
> > 
> > How about something like the following (compile tested only) patch?
> 
> It works as intended for this case, but I don't have my head wrapped around
> the implications of the change yet.
> 

The main implications are that if you have 100 mounts all going to the
same server that is down, then you'll get 100 connection attempts. If
the server is up, but has a wonky rpc.gssd service, then you'll get 100
attempts to set up krb5i...
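
As a rough illustration of that trade-off, here is a toy model in Python. It
is not the kernel code path: the function name, the `share_pending_client`
flag, and the server names are all hypothetical, and the "shared" mode is
only an assumed simplification of mounts waiting on one in-flight attempt
per server.

```python
from collections import Counter


def connection_attempts(mount_targets, share_pending_client):
    """Count connection attempts per server for a list of mount targets.

    share_pending_client=True models mounts that wait on a single
    in-flight attempt per server; False models every mount making its
    own attempt. Both modes are simplifications for illustration only.
    """
    attempts = Counter()
    for server in mount_targets:
        if share_pending_client and attempts[server] > 0:
            # An attempt is already pending: wait on it, don't start another.
            continue
        attempts[server] += 1
    return attempts


# Hypothetical workload: 100 mounts of a down server, 3 of a healthy one.
mounts = ["down-server"] * 100 + ["healthy-server"] * 3
shared = connection_attempts(mounts, share_pending_client=True)
independent = connection_attempts(mounts, share_pending_client=False)
print(shared["down-server"], independent["down-server"])
```

In this model the independent mode no longer makes unrelated mounts wait,
at the cost of one attempt per mount against the unreachable server.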

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx