> On Jan 27, 2020, at 9:45 AM, Robert Milkowski <rmilkowski@xxxxxxxxx> wrote: > > On Thu, 23 Jan 2020 at 19:08, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote: >> >> On Wed, 2020-01-22 at 19:10 +0000, Schumaker, Anna wrote: >>> Hi Robert, >>> >>> On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote: >>>>> -----Original Message----- >>>>> From: Chuck Lever <chuck.lever@xxxxxxxxxx> >>>>> Sent: 30 December 2019 15:37 >>>>> To: Robert Milkowski <rmilkowski@xxxxxxxxx> >>>>> Cc: Linux NFS Mailing List <linux-nfs@xxxxxxxxxxxxxxx>; Trond >>>>> Myklebust >>>>> <trond.myklebust@xxxxxxxxxxxxxxx>; Anna Schumaker >>>>> <anna.schumaker@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx >>>>> Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do >>>>> implicit >>>>> lease renewals >>>>> >>>>> >>>>> >>>>>> On Dec 30, 2019, at 10:20 AM, Robert Milkowski < >>>>>> rmilkowski@xxxxxxxxx> >>>>> wrote: >>>>>> From: Robert Milkowski <rmilkowski@xxxxxxxxx> >>>>>> >>>>>> Currently, each time nfs4_do_fsinfo() is called it will do an >>>>>> implicit >>>>>> NFS4 lease renewal, which is not compliant with the NFS4 >>>>> specification. >>>>>> This can result in a lease being expired by an NFS server. >>>>>> >>>>>> Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing >>>>>> leases") >>>>>> introduced implicit client lease renewal in nfs4_do_fsinfo(), >>>>>> which >>>>>> can result in the NFSv4.0 lease to expire on a server side, and >>>>>> servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID. >>>>>> >>>>>> This can easily be reproduced by frequently unmounting a sub- >>>>>> mount, >>>>>> then stat'ing it to get it mounted again, which will delay or >>>>>> even >>>>>> completely prevent client from sending RENEW operations if no >>>>>> other >>>>>> NFS operations are issued. Eventually nfs server will expire >>>>>> client's >>>>>> lease and return an error on file access or next RENEW. >>>>>> >>>>>> This can also happen when a sub-mount is automatically >>>>>> unmounted due >>>>>> to inactivity (after nfs_mountpoint_expiry_timeout), then it is >>>>>> mounted again via stat(). This can result in a short window >>>>>> during >>>>>> which client's lease will expire on a server but not on a >>>>>> client. >>>>>> This specific case was observed on production systems. >>>>>> >>>>>> This patch makes an explicit lease renewal instead of an >>>>>> implicit one, >>>>>> by adding RENEW to a compound operation issued by >>>>>> nfs4_do_fsinfo(), >>>>>> similarly to NFSv4.1 which adds SEQUENCE operation. >>>>>> >>>>>> Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing >>>>>> leases") >>>>>> Signed-off-by: Robert Milkowski <rmilkowski@xxxxxxxxx> >>>>> >>>>> Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx> >>>>> >>>>> >>>> >>>> How do we progress it further? >>> >>> Thanks for following up! I have the patch included in my linux-next >>> branch for >>> the next merge window. >> >> NACK. This is the wrong way to solve the problem. Lease renewal in >> NFSv4 does not need to be tied to fsinfo. It creates an unnecessary >> extra error condition that has absolutely nothing to do with the >> functionality of retrieving per-filesystem attributes. > > Well, we also do it for NFSv4.1+ with the SEQUENCE operation being > send by fsinfo, and as Chuck pointed out > it makes sense to do similarly for 4.0, which is what this patch does. I did say that. However, I can see that for NFSv4.1+, the client code handling the SEQUENCE response will update cl_last_renewal. It does not need to be done in the fsinfo code. The NFSv4.0 behavior should be correct if cl_last_renewal is not updated. That should force the client to send a separate RENEW operation so that both the client and server agree that the lease is active. If I understand Trond correctly? > Or as per the v2 version of the patch, not do the implicit RENEW for > 4.0, which was a simpler patch, > but then not in-line with 4.1+. > >> >> All that needs to be done here is to move the setting of clp- >>> cl_last_renewal _out_ of nfs4_set_lease_period(), and just have >> nfs4_proc_setclientid_confirm() and nfs4_update_session() call >> do_renew_lease(). >> > > This would also require nfs4_setup_state_renewal() to call > do_renew_lease() I think - at least it currently calls > nfs4_set_lease_period(). > Also, iirc fsinfo() not setting cl_last_renewal leads to > cl_last_renewal initialization issues under some circumstances. > > Then the RFC 7530 in section 16.34.5 states: > "SETCLIENTID_CONFIRM does not establish or renew a lease.", so calling > do_renew_lease() from nfs4_setclientid_confirm() doesn't seem to be > ok. > > I'm not sure if is is valid to do implicit lease renewal in > nfs4_update_session() either... > > > Anyway, the patch would be something like (haven't tested it yet): > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c > index 76d3716..a7af864 100644 > --- a/fs/nfs/nfs4proc.c > +++ b/fs/nfs/nfs4proc.c > @@ -5019,16 +5019,13 @@ static int nfs4_do_fsinfo(struct nfs_server > *server, struct nfs_fh *fhandle, str > struct nfs4_exception exception = { > .interruptible = true, > }; > - unsigned long now = jiffies; > int err; > > do { > err = _nfs4_do_fsinfo(server, fhandle, fsinfo); > trace_nfs4_fsinfo(server, fhandle, fsinfo->fattr, err); > if (err == 0) { > - nfs4_set_lease_period(server->nfs_client, > - fsinfo->lease_time * HZ, > - now); > + nfs4_set_lease_period(server->nfs_client, > fsinfo->lease_time * HZ) > break; > } > err = nfs4_handle_exception(server, err, &exception); > @@ -6146,6 +6143,10 @@ int nfs4_proc_setclientid_confirm(struct nfs_client *clp, > clp->cl_clientid); > status = rpc_call_sync(clp->cl_rpcclient, &msg, > RPC_TASK_TIMEOUT | RPC_TASK_NO_ROUND_ROBIN); > + if(status == 0) { > + unsigned long now = jiffies; > + do_renew_lease(clp, now); > + } > trace_nfs4_setclientid_confirm(clp, status); > dprintk("NFS reply setclientid_confirm: %d\n", status); > return status; > @@ -8590,6 +8591,8 @@ static void nfs4_update_session(struct > nfs4_session *session, > if (res->flags & SESSION4_BACK_CHAN) > memcpy(&session->bc_attrs, &res->bc_attrs, > sizeof(session->bc_attrs)); > + unsigned long now = jiffies; > + do_renew_lease(session->clp, now); > } > > static int _nfs4_proc_create_session(struct nfs_client *clp, > diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c > index 6ea431b..ff876dd 100644 > --- a/fs/nfs/nfs4renewd.c > +++ b/fs/nfs/nfs4renewd.c > @@ -138,15 +138,12 @@ > * > * @clp: pointer to nfs_client > * @lease: new value for lease period > - * @lastrenewed: time at which lease was last renewed > */ > void nfs4_set_lease_period(struct nfs_client *clp, > - unsigned long lease, > - unsigned long lastrenewed) > + unsigned long lease) > { > spin_lock(&clp->cl_lock); > clp->cl_lease_time = lease; > - clp->cl_last_renewal = lastrenewed; > spin_unlock(&clp->cl_lock); > > /* Cap maximum reconnect timeout at 1/2 lease period */ > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > index 3455232..d7b02fd 100644 > --- a/fs/nfs/nfs4state.c > +++ b/fs/nfs/nfs4state.c > @@ -102,7 +102,8 @@ static int nfs4_setup_state_renewal(struct nfs_client *clp) > now = jiffies; > status = nfs4_proc_get_lease_time(clp, &fsinfo); > if (status == 0) { > - nfs4_set_lease_period(clp, fsinfo.lease_time * HZ, now); > + nfs4_set_lease_period(clp, fsinfo.lease_time * HZ); > + nfs4_do_renew_lease(clp, now); > nfs4_schedule_state_renewal(clp); > } > > > > But it potentially has issues, as just pointed out, mainly with not > being compliant with rfc again. > > Also see the comments above about the SEQUENCE. > Trond - ? > > Chuck, Anna - which way do you prefer, the one proposed by Chuck or > the one now proposed by Trond? > > Or perhaps my v2 patch which is least invasive and just stops doing > implicit renewals for 4.0 if cl_last_renewal is already set. > > I personally think the way proposed by Chuck is fully compliant with > RFC and in-line with what we currently do for 4.1 via SEQUENCE, > so perhaps the best option. ? > > -- > Robert Milkowski -- Chuck Lever