Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Jan 27, 2020, at 9:45 AM, Robert Milkowski <rmilkowski@xxxxxxxxx> wrote:
> 
> On Thu, 23 Jan 2020 at 19:08, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
>> 
>> On Wed, 2020-01-22 at 19:10 +0000, Schumaker, Anna wrote:
>>> Hi Robert,
>>> 
>>> On Mon, 2020-01-20 at 17:55 +0000, Robert Milkowski wrote:
>>>>> -----Original Message-----
>>>>> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
>>>>> Sent: 30 December 2019 15:37
>>>>> To: Robert Milkowski <rmilkowski@xxxxxxxxx>
>>>>> Cc: Linux NFS Mailing List <linux-nfs@xxxxxxxxxxxxxxx>; Trond
>>>>> Myklebust
>>>>> <trond.myklebust@xxxxxxxxxxxxxxx>; Anna Schumaker
>>>>> <anna.schumaker@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
>>>>> Subject: Re: [PATCH v3] NFSv4.0: nfs4_do_fsinfo() should not do
>>>>> implicit
>>>>> lease renewals
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Dec 30, 2019, at 10:20 AM, Robert Milkowski <
>>>>>> rmilkowski@xxxxxxxxx>
>>>>> wrote:
>>>>>> From: Robert Milkowski <rmilkowski@xxxxxxxxx>
>>>>>> 
>>>>>> Currently, each time nfs4_do_fsinfo() is called it will do an
>>>>>> implicit
>>>>>> NFS4 lease renewal, which is not compliant with the NFS4
>>>>> specification.
>>>>>> This can result in a lease being expired by an NFS server.
>>>>>> 
>>>>>> Commit 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
>>>>>> leases")
>>>>>> introduced implicit client lease renewal in nfs4_do_fsinfo(),
>>>>>> which
>>>>>> can result in the NFSv4.0 lease to expire on a server side, and
>>>>>> servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.
>>>>>> 
>>>>>> This can easily be reproduced by frequently unmounting a sub-
>>>>>> mount,
>>>>>> then stat'ing it to get it mounted again, which will delay or
>>>>>> even
>>>>>> completely prevent client from sending RENEW operations if no
>>>>>> other
>>>>>> NFS operations are issued. Eventually nfs server will expire
>>>>>> client's
>>>>>> lease and return an error on file access or next RENEW.
>>>>>> 
>>>>>> This can also happen when a sub-mount is automatically
>>>>>> unmounted due
>>>>>> to inactivity (after nfs_mountpoint_expiry_timeout), then it is
>>>>>> mounted again via stat(). This can result in a short window
>>>>>> during
>>>>>> which client's lease will expire on a server but not on a
>>>>>> client.
>>>>>> This specific case was observed on production systems.
>>>>>> 
>>>>>> This patch makes an explicit lease renewal instead of an
>>>>>> implicit one,
>>>>>> by adding RENEW to a compound operation issued by
>>>>>> nfs4_do_fsinfo(),
>>>>>> similarly to NFSv4.1 which adds SEQUENCE operation.
>>>>>> 
>>>>>> Fixes: 83ca7f5ab31f ("NFS: Avoid PUTROOTFH when managing
>>>>>> leases")
>>>>>> Signed-off-by: Robert Milkowski <rmilkowski@xxxxxxxxx>
>>>>> 
>>>>> Reviewed-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>>>>> 
>>>>> 
>>>> 
>>>> How do we progress it further?
>>> 
>>> Thanks for following up! I have the patch included in my linux-next
>>> branch for
>>> the next merge window.
>> 
>> NACK. This is the wrong way to solve the problem. Lease renewal in
>> NFSv4 does not need to be tied to fsinfo. It creates an unnecessary
>> extra error condition that has absolutely nothing to do with the
>> functionality of retrieving per-filesystem attributes.
> 
> Well, we also do it for NFSv4.1+ with the SEQUENCE operation being
> send by fsinfo, and as Chuck pointed out
> it makes sense to do similarly for 4.0, which is what this patch does.

I did say that.

However, I can see that for NFSv4.1+, the client code handling the
SEQUENCE response will update cl_last_renewal. It does not need to
be done in the fsinfo code.

The NFSv4.0 behavior should be correct if cl_last_renewal is not
updated. That should force the client to send a separate RENEW
operation so that both the client and server agree that the lease
is active.

If I understand Trond correctly?


> Or as per the v2 version of the patch, not do the implicit RENEW for
> 4.0, which was a simpler patch,
> but then not in-line with 4.1+.
> 
>> 
>> All that needs to be done here is to move the setting of clp-
>>> cl_last_renewal _out_ of nfs4_set_lease_period(), and just have
>> nfs4_proc_setclientid_confirm() and nfs4_update_session() call
>> do_renew_lease().
>> 
> 
> This would also require nfs4_setup_state_renewal() to call
> do_renew_lease() I think - at least it currently calls
> nfs4_set_lease_period().
> Also, iirc fsinfo() not setting cl_last_renewal leads to
> cl_last_renewal initialization issues under some circumstances.
> 
> Then the RFC 7530 in section 16.34.5 states:
> "SETCLIENTID_CONFIRM does not establish or renew a lease.", so calling
> do_renew_lease() from nfs4_setclientid_confirm() doesn't seem to be
> ok.
> 
> I'm not sure if is is valid to do implicit lease renewal in
> nfs4_update_session() either...
> 
> 
> Anyway, the patch would be something like (haven't tested it yet):
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 76d3716..a7af864 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5019,16 +5019,13 @@ static int nfs4_do_fsinfo(struct nfs_server
> *server, struct nfs_fh *fhandle, str
>        struct nfs4_exception exception = {
>                .interruptible = true,
>        };
> -       unsigned long now = jiffies;
>        int err;
> 
>        do {
>                err = _nfs4_do_fsinfo(server, fhandle, fsinfo);
>                trace_nfs4_fsinfo(server, fhandle, fsinfo->fattr, err);
>                if (err == 0) {
> -                       nfs4_set_lease_period(server->nfs_client,
> -                                       fsinfo->lease_time * HZ,
> -                                       now);
> +                       nfs4_set_lease_period(server->nfs_client,
> fsinfo->lease_time * HZ)
>                        break;
>                }
>                err = nfs4_handle_exception(server, err, &exception);
> @@ -6146,6 +6143,10 @@ int nfs4_proc_setclientid_confirm(struct nfs_client *clp,
>                clp->cl_clientid);
>        status = rpc_call_sync(clp->cl_rpcclient, &msg,
>                               RPC_TASK_TIMEOUT | RPC_TASK_NO_ROUND_ROBIN);
> +       if(status == 0) {
> +               unsigned long now = jiffies;
> +               do_renew_lease(clp, now);
> +       }
>        trace_nfs4_setclientid_confirm(clp, status);
>        dprintk("NFS reply setclientid_confirm: %d\n", status);
>        return status;
> @@ -8590,6 +8591,8 @@ static void nfs4_update_session(struct
> nfs4_session *session,
>        if (res->flags & SESSION4_BACK_CHAN)
>                memcpy(&session->bc_attrs, &res->bc_attrs,
>                                sizeof(session->bc_attrs));
> +       unsigned long now = jiffies;
> +       do_renew_lease(session->clp, now);
> }
> 
> static int _nfs4_proc_create_session(struct nfs_client *clp,
> diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c
> index 6ea431b..ff876dd 100644
> --- a/fs/nfs/nfs4renewd.c
> +++ b/fs/nfs/nfs4renewd.c
> @@ -138,15 +138,12 @@
>  *
>  * @clp: pointer to nfs_client
>  * @lease: new value for lease period
> - * @lastrenewed: time at which lease was last renewed
>  */
> void nfs4_set_lease_period(struct nfs_client *clp,
> -               unsigned long lease,
> -               unsigned long lastrenewed)
> +               unsigned long lease)
> {
>        spin_lock(&clp->cl_lock);
>        clp->cl_lease_time = lease;
> -       clp->cl_last_renewal = lastrenewed;
>        spin_unlock(&clp->cl_lock);
> 
>        /* Cap maximum reconnect timeout at 1/2 lease period */
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 3455232..d7b02fd 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -102,7 +102,8 @@ static int nfs4_setup_state_renewal(struct nfs_client *clp)
>        now = jiffies;
>        status = nfs4_proc_get_lease_time(clp, &fsinfo);
>        if (status == 0) {
> -               nfs4_set_lease_period(clp, fsinfo.lease_time * HZ, now);
> +               nfs4_set_lease_period(clp, fsinfo.lease_time * HZ);
> +               nfs4_do_renew_lease(clp, now);
>                nfs4_schedule_state_renewal(clp);
>        }
> 
> 
> 
> But it potentially has issues, as just pointed out, mainly with not
> being compliant with rfc again.
> 
> Also see the comments above about the SEQUENCE.
> Trond - ?
> 
> Chuck, Anna - which way do you prefer, the one proposed by Chuck or
> the one now proposed by Trond?
> 
> Or perhaps my v2 patch which is least invasive and just stops doing
> implicit renewals for 4.0 if cl_last_renewal is already set.
> 
> I personally think the way proposed by Chuck is fully compliant with
> RFC and in-line with what we currently do for 4.1 via SEQUENCE,
> so perhaps the best option. ?
> 
> -- 
> Robert Milkowski

--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux