Re: [PATCH RFC 3/3] nfsd: Initial implementation of NFSv4 Courteous Server

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Feb 4, 2022, at 12:02 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
> 
> 
> On 2/4/22 7:25 AM, Chuck Lever III wrote:
>> 
>>> On Feb 3, 2022, at 10:42 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
>>> 
>>> 
>>> On 2/3/22 3:40 PM, Chuck Lever III wrote:
>>>>> On Feb 3, 2022, at 4:38 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
>>>>> 
>>>>> 
>>>>> On 2/3/22 11:31 AM, Chuck Lever III wrote:
>>>>>>> On Jan 28, 2022, at 2:39 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> Currently an NFSv4 client must maintain its lease by using the at least
>>>>>>> one of the state tokens or if nothing else, by issuing a RENEW (4.0), or
>>>>>>> a singleton SEQUENCE (4.1) at least once during each lease period. If the
>>>>>>> client fails to renew the lease, for any reason, the Linux server expunges
>>>>>>> the state tokens immediately upon detection of the "failure to renew the
>>>>>>> lease" condition and begins returning NFS4ERR_EXPIRED if the client should
>>>>>>> reconnect and attempt to use the (now) expired state.
>>>>>>> 
>>>>>>> The default lease period for the Linux server is 90 seconds.  The typical
>>>>>>> client cuts that in half and will issue a lease renewing operation every
>>>>>>> 45 seconds. The 90 second lease period is very short considering the
>>>>>>> potential for moderately long term network partitions.  A network partition
>>>>>>> refers to any loss of network connectivity between the NFS client and the
>>>>>>> NFS server, regardless of its root cause.  This includes NIC failures, NIC
>>>>>>> driver bugs, network misconfigurations & administrative errors, routers &
>>>>>>> switches crashing and/or having software updates applied, even down to
>>>>>>> cables being physically pulled.  In most cases, these network failures are
>>>>>>> transient, although the duration is unknown.
>>>>>>> 
>>>>>>> A server which does not immediately expunge the state on lease expiration
>>>>>>> is known as a Courteous Server.  A Courteous Server continues to recognize
>>>>>>> previously generated state tokens as valid until conflict arises between
>>>>>>> the expired state and the requests from another client, or the server
>>>>>>> reboots.
>>>>>>> 
>>>>>>> The initial implementation of the Courteous Server will do the following:
>>>>>>> 
>>>>>>> . When the laundromat thread detects an expired client and if that client
>>>>>>> still has established state on the Linux server and there is no waiters
>>>>>>> for the client's locks then deletes the client persistent record and marks
>>>>>>> the client as COURTESY_CLIENT and skips destroying the client and all of
>>>>>>> state, otherwise destroys the client as usual.
>>>>>>> 
>>>>>>> . Client persistent record is added to the client database when the
>>>>>>> courtesy client reconnects and transits to normal client.
>>>>>>> 
>>>>>>> . Lock/delegation/share reversation conflict with courtesy client is
>>>>>>> resolved by marking the courtesy client as DESTROY_COURTESY_CLIENT,
>>>>>>> effectively disable it, then allow the current request to proceed
>>>>>>> immediately.
>>>>>>> 
>>>>>>> . Courtesy client marked as DESTROY_COURTESY_CLIENT is not allowed to
>>>>>>> reconnect to reuse itsstate. It is expired by the laundromat asynchronously
>>>>>>> in the background.
>>>>>>> 
>>>>>>> Signed-off-by: Dai Ngo <dai.ngo@xxxxxxxxxx>
>>>>>>> ---
>>>>>>> fs/nfsd/nfs4state.c | 454 +++++++++++++++++++++++++++++++++++++++++++++++-----
>>>>>>> fs/nfsd/state.h     |   5 +
>>>>>>> 2 files changed, 415 insertions(+), 44 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
>>>>>>> index 1956d377d1a6..b302d857e196 100644
>>>>>>> --- a/fs/nfsd/nfs4state.c
>>>>>>> +++ b/fs/nfsd/nfs4state.c
>>>>>>> @@ -1913,14 +1915,37 @@ __find_in_sessionid_hashtbl(struct nfs4_sessionid *sessionid, struct net *net)
>>>>>>> 
>>>>>>> static struct nfsd4_session *
>>>>>>> find_in_sessionid_hashtbl(struct nfs4_sessionid *sessionid, struct net *net,
>>>>>>> -		__be32 *ret)
>>>>>>> +		__be32 *ret, bool *courtesy_clnt)
>>>>>> IMO the new @courtesy_clnt parameter isn't necessary.
>>>>>> Just create a new cl_flag:
>>>>>> 
>>>>>> +#define NFSD4_COURTESY_CLIENT		(6)	/* be nice to expired client */
>>>>>> +#define NFSD4_DESTROY_COURTESY_CLIENT	(7)
>>>>>> 
>>>>>> #define NFSD4_CLIENT_PROMOTE_COURTESY   (8)
>>>>>> 
>>>>>> or REHYDRATE_COURTESY some such.
>>>>>> 
>>>>>> Set that flag and check it once it is safe to call
>>>>>> nfsd4_client_record_create().
>>>>> We need the 'courtesy_clnt' parameter so caller can specify
>>>>> whether the courtesy client should be promoted or not.
>>>> I understand what the flag is used for in the patch, but I
>>>> prefer to see this implemented without changing the synopsis
>>>> of all those functions. Especially adding an output parameter
>>>> like this is usually frowned upon.
>>>> 
>>>> The struct nfs_client can carry this flag, if not in cl_flags,
>>>> then perhaps in another field. That struct is visible in every
>>>> one of the callers.
>>> The struct nfs4_client is not available to the caller of
>>> find_in_sessionid_hashtbl at the time it calls the function and
>>> the current input parameters of find_in_sessionid_hashtbl can
>>> not be used to specify this flag.
>> I see three callers of find_in_sessionid_hashtbl():
>> 
>> - nfsd4_bind_conn_to_session
>> - nfsd4_destroy_session
>> - nfsd4_sequence
>> 
>> In none of these callers is the courtesy_clnt variable set
>> to a true or false value _before_ find_in_sessionid_hashtbl()
>> is called. AFAICT, @courtesy_clnt is an output-only parameter.
> 
> If a caller is interested in the courtesy client, it passes
> in the address of courtesy_clnt and find_in_sessionid_hashtbl
> will take appropriate action and return the result, otherwise
> pass in a NULL.

Dai, please get rid of @courtesy_clnt. All of the callers
can check the returned client's status. If they are not
interested in knowing whether the client needs to be
re-recorded, they can ignore that bit of information.

You need to address this before posting v11. Thanks!


> -Dai
> 
>> 
>> The returned @session::se_client field points to a client
>> that can be examined to see if it has been promoted back to
>> active status.
>> 
>> --
>> Chuck Lever

--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux