Re: [PATCH RFC v13 4/4] nfsd: Initial implementation of NFSv4 Courteous Server

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Wed, 16 Feb 2022 16:15:10 +0000

> On Feb 16, 2022, at 4:56 AM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
> 
> On 2/15/22 9:17 AM, Chuck Lever III wrote:
>> 
>>> On Feb 12, 2022, at 1:12 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
>>> 
>>> @@ -3118,6 +3175,14 @@ static __be32 copy_impl_id(struct nfs4_client *clp,
>>> 	return 0;
>>> }
>>> 
>>> +static void
>>> +nfsd4_discard_courtesy_clnt(struct nfs4_client *clp)
>>> +{
>>> +	spin_lock(&clp->cl_cs_lock);
>>> +	set_bit(NFSD4_CLIENT_DESTROY_COURTESY, &clp->cl_flags);
>>> +	spin_unlock(&clp->cl_cs_lock);
>>> +}
>>> +
>>> __be32
>>> nfsd4_exchange_id(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>> 		union nfsd4_op_u *u)
>>> @@ -3195,6 +3260,10 @@ nfsd4_exchange_id(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>>> 	/* Cases below refer to rfc 5661 section 18.35.4: */
>>> 	spin_lock(&nn->client_lock);
>>> 	conf = find_confirmed_client_by_name(&exid->clname, nn);
>>> +	if (conf && test_bit(NFSD4_CLIENT_COURTESY_CLNT, &conf->cl_flags)) {
>>> +		nfsd4_discard_courtesy_clnt(conf);
>>> +		conf = NULL;
>>> +	}
>>> 	if (conf) {
>>> 		bool creds_match = same_creds(&conf->cl_cred, &rqstp->rq_cred);
>>> 		bool verfs_match = same_verf(&verf, &conf->cl_verifier);
>>> @@ -3462,6 +3531,10 @@ nfsd4_create_session(struct svc_rqst *rqstp,
>>> 	spin_lock(&nn->client_lock);
>>> 	unconf = find_unconfirmed_client(&cr_ses->clientid, true, nn);
>>> 	conf = find_confirmed_client(&cr_ses->clientid, true, nn);
>>> +	if (conf && test_bit(NFSD4_CLIENT_COURTESY_CLNT, &conf->cl_flags)) {
>>> +		nfsd4_discard_courtesy_clnt(conf);
>>> +		conf = NULL;
>>> +	}
>> I'm seeing this bit of logic over and over again. I'm wondering
>> why "set_bit(NFSD4_CLIENT_DESTROY_COURTESY, &clp->cl_flags);" cannot
>> be done in the "find_confirmed_yada" functions? The "find" function
>> can even return NULL in that case, so changing all these call sites
>> should be totally unnecessary (except in a couple of cases where I
>> see there is additional logic at the call site).
> 
> This is because not all consumers of find_client_confirm wants to
> discard the courtesy client. The lookup_clientid needs to return the
> courtesy client to its callers because one of the callers needs to
> transit the courtesy client to an active client.

Since find_confirmed_client() is a small function, I would
create a patch that refactors lookup_client() to pull the
existing find_confirmed_client() into that. Apply that patch
first. Then the big patch can change find_confirmed_client()
to set NFSD4_CLIENT_DESTROY_COURTESY.

What about the other find_confirmed_* functions?

>>> +		 */
>>> +		if (!cour) {
>>> +			set_bit(NFSD4_CLIENT_COURTESY, &clp->cl_flags);
>>> +			clp->courtesy_client_expiry = ktime_get_boottime_seconds() +
>>> +					NFSD_COURTESY_CLIENT_TIMEOUT;
>>> +			list_add(&clp->cl_cs_list, &cslist);
>> Can cl_lru (or some other existing list_head field)
>> be used instead of cl_cs_list?
> 
> The cl_lru is used for clients to be expired, the cl_cs_list
> is used for courtesy clients and they are treated differently.

Understood, but cl_lru is not a list. It's a field that is
used to attach an nfs4_client _to_ a list.

You should be able to use cl_lru here if the nfs4_client is
going to be added to either reaplist or cslist but not both.

>> I don't see anywhere that removes clp from cslist when
>> this processing is complete. Seems like you will get
>> list corruption next time the laundromat looks at
>> its list of nfs4_clients.
> 
> We re-initialize the list head before every time the laundromat
> runs so there is no need to remove the entries once we're done.

Re-initializing cslist does not change the membership
of the list that was just constructed, it simply orphans
the list. Next time the code does a list_add(&clp->cl_cs_list)
that list will still be there and the nfs4_client will still
be on it.

The nfs4_client has to be explicitly removed from cslist
before the function completes. Otherwise, cl_cs_list
will link those nfs4_client objects to garbage, and the
next time nfs4_get_client_reaplist() is called, that
list_for_each_entry() will walk off the end of the previous
(now phantom) list that the cl_cs_list is still linked to.

Please ensure that there is a "list_del();" somewhere
before the function exits and cslist vanishes. You could,
for example, replace the list_for_each_entry() with a

    while(!list_empty(&cslist)) {
	list_del(&clp->cl_cs_list /* or cl_lru */ );
	...
    }

>>> +/**
>>> + * nfsd4_fl_lock_expired - check if lock conflict can be resolved.
>>> + *
>>> + * @fl: pointer to file_lock with a potential conflict
>>> + * Return values:
>>> + *   %false: real conflict, lock conflict can not be resolved.
>>> + *   %true: no conflict, lock conflict was resolved.
>>> + *
>>> + * Note that this function is called while the flc_lock is held.
>>> + */
>>> +static bool
>>> +nfsd4_fl_lock_expired(struct file_lock *fl)
>> I'd prefer this guy to be named like the newer lm_ functions,
>> not the old fl_ functions. So: nfsd4_lm_lock_expired()
> 
> This is a bit messy:
> 
> static const struct lock_manager_operations nfsd_posix_mng_ops  = {
>        .lm_notify = nfsd4_lm_notify,
>        .lm_get_owner = nfsd4_fl_get_owner,
>        .lm_put_owner = nfsd4_fl_put_owner,
>        .lm_lock_expired = nfsd4_fl_lock_expired,
> };
> 
> Most NFS callbacks are named nfsd4_fl_xx and one as
> nfsd4_fl_lock_expired.

The existing lm_notify callback name is correct as it
stands: nfsd4_lm_notify.

> I will change nfsd4_fl_lock_expired to
> nfsd4_lm_lock_expired as suggested but note this inconsistency
> is still there.

The usual practice is to name the function instances
the same as the method names. aef9583b234a ("NFSD: Get
reference of lockowner when coping file_lock") missed
this -- the middle two should both be nfsd4_lm_yada.

I will add a patch to rename these two before you
rebase for v14.

>>> +{
>>> +	struct nfs4_lockowner *lo;
>>> +	struct nfs4_client *clp;
>>> +	bool rc = false;
>>> +
>>> +	if (!fl)
>>> +		return false;
>>> +	lo = (struct nfs4_lockowner *)fl->fl_owner;
>>> +	clp = lo->lo_owner.so_client;
>>> +
>>> +	/* need to sync with courtesy client trying to reconnect */
>>> +	spin_lock(&clp->cl_cs_lock);
>>> +	if (test_bit(NFSD4_CLIENT_DESTROY_COURTESY, &clp->cl_flags))
>>> +		rc = true;
>>> +	else {
>>> +		if (test_bit(NFSD4_CLIENT_COURTESY, &clp->cl_flags)) {
>>> +			set_bit(NFSD4_CLIENT_DESTROY_COURTESY, &clp->cl_flags);
>>> +			rc =  true;
>>> +		} else
>>> +			rc =  false;
>> Couldn't you write it this way instead:
>> 
>> 	if (test_bit(NFSD4_CLIENT_COURTESY, &clp->cl_flags))
>> 		set_bit(NFSD4_CLIENT_DESTROY_COURTESY, &clp->cl_flags);
>> 	rc = !!test_bit(NFSD4_CLIENT_DESTROY_COURTESY, &clp->cl_flags);
>> 
>> This is more a check to see whether I understand what's
>> going on rather than a request to change the patch.
> 
> I think it works the same. Every time I see a '!!' it gives me
> a headache :-)

Indeed, it takes some getting used to.

>>> +	}
>>> +	spin_unlock(&clp->cl_cs_lock);
>>> +	return rc;
>>> +}
>>> +
>>> static fl_owner_t
>>> nfsd4_fl_get_owner(fl_owner_t owner)
>>> {
>>> @@ -6572,6 +6965,7 @@ static const struct lock_manager_operations nfsd_posix_mng_ops  = {
>>> 	.lm_notify = nfsd4_lm_notify,
>>> 	.lm_get_owner = nfsd4_fl_get_owner,
>>> 	.lm_put_owner = nfsd4_fl_put_owner,
>>> +	.lm_lock_expired = nfsd4_fl_lock_expired,
>>> };
>>> 
>>> static inline void
>>> diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
>>> index 3e5008b475ff..920fad00e2e4 100644
>>> --- a/fs/nfsd/nfsd.h
>>> +++ b/fs/nfsd/nfsd.h
>>> @@ -336,6 +336,7 @@ void		nfsd_lockd_shutdown(void);
>>> #define COMPOUND_ERR_SLACK_SPACE	16     /* OP_SETATTR */
>>> 
>>> #define NFSD_LAUNDROMAT_MINTIMEOUT      1   /* seconds */
>>> +#define	NFSD_COURTESY_CLIENT_TIMEOUT	(24 * 60 * 60)	/* seconds */
>>> 
>>> /*
>>>  * The following attributes are currently not supported by the NFSv4 server:
>>> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
>>> index 95457cfd37fc..80e565593d83 100644
>>> --- a/fs/nfsd/state.h
>>> +++ b/fs/nfsd/state.h
>>> @@ -345,6 +345,9 @@ struct nfs4_client {
>>> #define NFSD4_CLIENT_UPCALL_LOCK	(5)	/* upcall serialization */
>>> #define NFSD4_CLIENT_CB_FLAG_MASK	(1 << NFSD4_CLIENT_CB_UPDATE | \
>>> 					 1 << NFSD4_CLIENT_CB_KILL)
>>> +#define NFSD4_CLIENT_COURTESY		(6)	/* be nice to expired client */
>> The comment is a little obtuse. If the client is
>> actually expired, then it will be ignored and
>> destroyed. Maybe "client is unreachable" ?
> 
> I think "client is unreachable" is not precise and kind of
> confusing so unless you insists I'd like to keep it this way
> or just removing it.

I think we have to be careful with the terminology.

An expired client is one the server is going to destroy,
and courtesy clients are rather going to be spared. The
whole point of this work is that the server is _not_ going
to expire the client's lease. Calling it an expired client
here is contrary to that intention.

Unless you can think of a concise way to state that in the
comment, let's just remove it.

>>> +#define NFSD4_CLIENT_DESTROY_COURTESY	(7)
>> Maybe NFSD4_CLIENT_EXPIRE_COURTESY ? Dunno.
> 
> Unless you, or other reviewers, insist I'd like to keep it this way.

I think NFSD4_CLIENT_EXPIRED, actually, is going to make
the test_bit()s in fs/nfsd/nfs4state.c more easy to
understand. The transition is courtesy -> expired, right?
And as you say below, the mainline logic has to decide what
to do with one of these clients -- it might not immediately
destroy it, but instead might just want to ignore the
nfs4_client (for example, during lock conflict resolution).

Try NFSD4_CLIENT_EXPIRED. If it's awful we can switch back.
"It's only ones and zeroes."

>>> +#define NFSD4_CLIENT_COURTESY_CLNT	(8)	/* used for lookup clientid/name */
>> The name CLIENT_COURTESY_CLNT doesn't make sense to me
>> when it appears in context. The comment doesn't clarify
>> it either. May I suggest:
>> 
>> #define NFSD4_CLIENT_RENEW_COURTESY	(8)	/* courtesy -> active */
> 
> The NFSD4_CLIENT_COURTESY_CLNT flag does not mean this courtesy
> client will always transit to active client. The flag is used to
> indicate this was a courtesy client and it's up to the caller to
> take appropriate action; destroy it or create client record before
> using it.

I get it: The flag names should reflect a state, not a
requested action.

But I still find CLIENT_COURTESY_CLNT to be unhelpfully
obscure. How about NFSD4_CLIENT_RECONNECTED ?

--
Chuck Lever