On Mon, 27 Jan 2020 at 15:05, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: > > > > > On Jan 27, 2020, at 9:45 AM, Robert Milkowski <rmilkowski@xxxxxxxxx> wrote: > > > > On Thu, 23 Jan 2020 at 19:08, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: > >> > >> On Wed, 2020-01-22 at 19:10 +0000, Schumaker, Anna wrote: > >>> Hi Robert, > >>> > >>> On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote: > >>>>> -----Original Message----- > >>>>> From: Chuck Lever <chuck.lever@xxxxxxxxxx> > >>>>> Sent: 30 December 2019 15:37 > >>>>> To: Robert Milkowski <rmilkowski@xxxxxxxxx> > >>>>> Cc: Linux NFS Mailing List <linux-nfs@xxxxxxxxxxxxxxx>; Trond > >>>>> Myklebust > >>>>> <trond.myklebust@xxxxxxxxxxxxxxx>; Anna Schumaker > >>>>> <anna.schumaker@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx > >>>>> Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do > >>>>> implicit > >>>>> lease renewals > >>>>> > >>>>> > >>>>> > >>>>>> On Dec 30, 2019, at 10:20 AM, Robert Milkowski < > >>>>>> rmilkowski@xxxxxxxxx> > >>>>> wrote: > >>>>>> From: Robert Milkowski <rmilkowski@xxxxxxxxx> > >>>>>> > >>>>>> Currently, each time nfs4_do_fsinfo() is called it will do an > >>>>>> implicit > >>>>>> NFS4 lease renewal, which is not compliant with the NFS4 > >>>>> specification. > >>>>>> This can result in a lease being expired by an NFS server. > >>>>>> > >>>>>> Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing > >>>>>> leases") > >>>>>> introduced implicit client lease renewal in nfs4_do_fsinfo(), > >>>>>> which > >>>>>> can result in the NFSv4.0 lease to expire on a server side, and > >>>>>> servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID. > >>>>>> > >>>>>> This can easily be reproduced by frequently unmounting a sub- > >>>>>> mount, > >>>>>> then stat'ing it to get it mounted again, which will delay or > >>>>>> even > >>>>>> completely prevent client from sending RENEW operations if no > >>>>>> other > >>>>>> NFS operations are issued. Eventually nfs server will expire > >>>>>> client's > >>>>>> lease and return an error on file access or next RENEW. > >>>>>> > >>>>>> This can also happen when a sub-mount is automatically > >>>>>> unmounted due > >>>>>> to inactivity (after nfs_mountpoint_expiry_timeout), then it is > >>>>>> mounted again via stat(). This can result in a short window > >>>>>> during > >>>>>> which client's lease will expire on a server but not on a > >>>>>> client. > >>>>>> This specific case was observed on production systems. > >>>>>> > >>>>>> This patch makes an explicit lease renewal instead of an > >>>>>> implicit one, > >>>>>> by adding RENEW to a compound operation issued by > >>>>>> nfs4_do_fsinfo(), > >>>>>> similarly to NFSv4.1 which adds SEQUENCE operation. > >>>>>> > >>>>>> Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing > >>>>>> leases") > >>>>>> Signed-off-by: Robert Milkowski <rmilkowski@xxxxxxxxx> > >>>>> > >>>>> Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > >>>>> > >>>>> > >>>> > >>>> How do we progress it further? > >>> > >>> Thanks for following up! I have the patch included in my linux-next > >>> branch for > >>> the next merge window. > >> > >> NACK. This is the wrong way to solve the problem. Lease renewal in > >> NFSv4 does not need to be tied to fsinfo. It creates an unnecessary > >> extra error condition that has absolutely nothing to do with the > >> functionality of retrieving per-filesystem attributes. > > > > Well, we also do it for NFSv4.1+ with the SEQUENCE operation being > > send by fsinfo, and as Chuck pointed out > > it makes sense to do similarly for 4.0, which is what this patch does. > > I did say that. > > However, I can see that for NFSv4.1+, the client code handling the > SEQUENCE response will update cl_last_renewal. It does not need to > be done in the fsinfo code. I think it is the case, yes. > > The NFSv4.0 behavior should be correct if cl_last_renewal is not > updated. That should force the client to send a separate RENEW > operation so that both the client and server agree that the lease > is active. > I was thinking about removing the call to update the last renewal entirely in do_fsinfo(), however as briefly discussed back in Dec there is an issue with cl_last_renewal initialization on initial mount in 4.0 I observed it to be 0, as nfs4_setup_state_renewal() calls nfs4_proc_get_lease_time() on initial mount and if no error it calls nfs4_set_lease_period(). However with sec=krb the call to nfs4_proc_get_lease_time() returns NFS4ERR_WRONGSEC) during initial mount (which seems to be ok), which results in not setting cl_last_renewal, which iirc prevented scheduling RENEW operations altogether. As discussed then, this is really a separate issue which should be fixed separately. Once fixed then fsinfo() shouldn't need to set cl_last_renewal at all, still sending RENEW in fsinfo() seems like a good idea to make it more inline with what we do for 4.1. > If I understand Trond correctly? > The problem with what Trond proposed is that it seems to go against the rfc, although it should fix the initialization issue. -- Robert Milkowski