Re: Thoughts on mount option to configure client lease renewal time.

On Tue, 16 Aug 2022, Trond Myklebust wrote:
> On Tue, 2022-08-16 at 09:35 +1000, NeilBrown wrote:
> > 
> > Currently the Linux NFS client renews leases at 2/3 of the lease time
> > advised by the server.
> > Some server vendors (Not Exactly Targeting Any Particular Party)
> > recommend very short lease times - as short as 5 seconds in fail-over
> > configurations.  This means 1.7 seconds of jitter in any part of the
> > system can result in leases being lost - but it does achieve fast
> > fail-over.
> > 
> > If we could configure a 5 second lease-renewal on the client, but leave
> > a 60 second lease time on the server, then we could get the best of both
> > worlds.  Failover would happen quickly, but you would need a much longer
> > load spike or network partition to cause the loss of leases.
> > 
> > As v4.1 can end the grace period early once everyone checks in, a large
> > grace period (which is needed for a large lease time) would rarely be a
> > problem.
> > 
> > So my thought is to add a mount option "lease-renew=5" for v4.1+ mounts.
> > The client then uses that number provided it is less than 2/3 of the
> > server-declared lease time.
> > 
> > What do people think of this?  Is there a better solution, or a problem
> > with this one?
> > 
> > NeilBrown
> >  
> 
> I don't see how the NFS client can ever guarantee a 5 second lease
> renewal time, so as far as I'm concerned, this is not a problem we need
> to solve.

I completely agree with the first statement.
The problem we need to solve is whatever problem it is that motivates
server vendors to recommend unrealistically short lease times.

I believe this problem is fail-over time.
Assuming that a server fail-over happens instantly, full NFS service does
not resume until after the grace period completes.

Providing clients send RECLAIM_COMPLETE appropriately, the grace period
could easily be as long as:

  client renew time + time to reclaim all state

as clients that are idle (or busy thinking, not accessing the
filesystem) will not notice the failover until they send a renew, which
may not be until the full renew time has passed.
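
As a rough illustration of the arithmetic above (a sketch only, not
kernel code): the function names are made up for this example, and the
2 second reclaim time is an arbitrary assumed value.

```python
def default_renew(lease_time: float) -> float:
    """Renew interval at 2/3 of the server-advised lease time."""
    return lease_time * 2 / 3


def jitter_slack(lease_time: float, renew: float) -> float:
    """Delay a renewal can suffer before the lease is lost."""
    return lease_time - renew


def worst_case_grace(renew: float, reclaim_time: float) -> float:
    """Bound from the text: client renew time + time to reclaim all state."""
    return renew + reclaim_time


RECLAIM = 2.0  # assumed time to reclaim all state, in seconds

for lease, renew in [(5.0, default_renew(5.0)),    # short lease, default renew
                     (60.0, default_renew(60.0)),  # long lease, default renew
                     (60.0, 5.0)]:                 # long lease, short renew
    print(f"lease={lease:5.1f}s renew={renew:5.2f}s "
          f"slack={jitter_slack(lease, renew):5.2f}s "
          f"grace<={worst_case_grace(renew, RECLAIM):5.2f}s")
```

The last case is the one being proposed: the grace-period bound stays
close to that of the 5 second lease, while the slack before a lease is
lost grows from under 2 seconds to 55 seconds.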

The only part of that calculation that can be controlled is the client
renew time, and at present that can only be controlled by reducing the
lease time.  Hence the recommendation for a short lease time.

If we could provide an alternative means of reducing the client renew
time - a mount option - then there would be no incentive to recommend an
impractically short lease time.
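
A minimal sketch of how the proposed "lease-renew" value might be
applied, following the rule suggested earlier in the thread (use it only
when it is less than 2/3 of the server-declared lease time).  The
function name is hypothetical, and falling back to the 2/3 default when
the option is absent or too large is an assumption, not existing client
behaviour.

```python
from typing import Optional


def effective_renew(lease_time: float,
                    lease_renew: Optional[float] = None) -> float:
    """Pick the renew interval for a mount.

    lease_time:  lease time declared by the server, in seconds.
    lease_renew: value of the proposed mount option, or None if unset.
    """
    default = lease_time * 2 / 3
    # Honour the option only when it is positive and shorter than the
    # default; otherwise keep the default (assumed fallback).
    if lease_renew is not None and 0 < lease_renew < default:
        return lease_renew
    return default


print(effective_renew(60, 5))   # option is shorter than the default, so it is used
print(effective_renew(60))      # no option given, 2/3 of the lease time is used
print(effective_renew(90, 70))  # option too large, 2/3 of the lease time is used
```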

Thanks,
NeilBrown



