Re: [PATCH RFC v14 09/10] NFSD: Update laundromat to handle courtesy clients

Chuck Lever III <chuck.lever@xxxxxxxxxx> · Fri, 25 Feb 2022 17:29:22 +0000

> On Feb 23, 2022, at 1:16 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
> 
> Add nfs4_anylock_blocker to check if an expired client has any
> blockers.
> 
> Update nfs4_get_client_reaplist to:
> . add discarded courtesy client; client marked with CLIENT_EXPIRED,
>   to reaplist.
> . detect if expired client still has state and no blockers then
>   transit it to courtesy client by setting CLIENT_COURTESY flag
>   and removing the client record.

Thanks for breaking up the v13 patch and adding these nice
patch descriptions !

> Signed-off-by: Dai Ngo <dai.ngo@xxxxxxxxxx>
> ---
> fs/nfsd/nfs4state.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 98 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 01c51adf4873..282b8f040540 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -5821,24 +5821,120 @@ static void nfsd4_ssc_expire_umount(struct nfsd_net *nn)
> }
> #endif
> 
> +static bool
> +nfs4_anylock_blocker(struct nfs4_client *clp)
> +{
> +	int i;
> +	struct nfs4_stateowner *so, *tmp;
> +	struct nfs4_lockowner *lo;
> +	struct nfs4_ol_stateid *stp;
> +	struct nfs4_file *nf;
> +	struct inode *ino;
> +	struct file_lock_context *ctx;
> +	struct file_lock *fl;
> +
> +	spin_lock(&clp->cl_lock);
> +	for (i = 0; i < OWNER_HASH_SIZE; i++) {
> +		/* scan each lock owner */
> +		list_for_each_entry_safe(so, tmp, &clp->cl_ownerstr_hashtbl[i],
> +				so_strhash) {

You're holding the cl_lock and not changing the hash bucket
list inside the loop, so list_for_each_entry() is sufficient
here. _safe is not needed.

> +			if (so->so_is_open_owner)
> +				continue;
> +
> +			/* scan lock states of this lock owner */
> +			lo = lockowner(so);
> +			list_for_each_entry(stp, &lo->lo_owner.so_stateids,
> +					st_perstateowner) {
> +				nf = stp->st_stid.sc_file;
> +				ino = nf->fi_inode;
> +				ctx = ino->i_flctx;
> +				if (!ctx)
> +					continue;
> +				/* check each lock belongs to this lock state */
> +				list_for_each_entry(fl, &ctx->flc_posix, fl_list) {
> +					if (fl->fl_owner != lo)
> +						continue;
> +					if (!list_empty(&fl->fl_blocked_requests)) {
> +						spin_unlock(&clp->cl_lock);
> +						return true;
> +					}
> +				}

IIUC fl_blocked_requests is internal to fs/locks.c, so this
little piece needs to be moved there. Jeff can give a little
more guidance on function naming, etc.

Have a look at Section 6 of Documentation/process/coding-style.rst
for some hints about what to do with long functions. This one is
long and indented far to the right: that is indication that it
should be broken into smaller helpers with names that make it
clear what each step does -- no comments needed.

So for legibility, "check each lock belongs..." is a function that
should be moved to fs/locks.c, as I mentioned above; and "scan lock
state" can be its own function that calls "check each lock belongs".

I'm not sure what the comment "check each lock belongs to this lock
state" means. I thought we were looking at blocked lists, not locks,
in this search?

> +			}
> +		}
> +	}
> +	spin_unlock(&clp->cl_lock);
> +	return false;
> +}
> +
> static void
> nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist,
> 				struct laundry_time *lt)
> {
> 	struct list_head *pos, *next;
> 	struct nfs4_client *clp;
> +	bool cour;
> +	struct list_head cslist;
> 
> 	INIT_LIST_HEAD(reaplist);
> +	INIT_LIST_HEAD(&cslist);
> 	spin_lock(&nn->client_lock);
> 	list_for_each_safe(pos, next, &nn->client_lru) {
> 		clp = list_entry(pos, struct nfs4_client, cl_lru);
> 		if (!state_expired(lt, clp->cl_time))
> 			break;
> -		if (mark_client_expired_locked(clp))
> +
> +		/* client expired */
> +		if (!client_has_state(clp)) {
> +			if (mark_client_expired_locked(clp))
> +				continue;
> +			list_add(&clp->cl_lru, reaplist);
> +			continue;

The above looks like the same thing that is done below at
the "exp_client:" label. Should you replace this with

	if (!client_has_state(clp))
		goto exp_client;

?

Then I think the one-line comments would be unnecessary.
The code would be self-explanatory.

> +		}
> +
> +		/* expired client has state */
> +		if (test_bit(NFSD4_CLIENT_EXPIRED, &clp->cl_flags))
> +			goto exp_client;
> +		cour = test_bit(NFSD4_CLIENT_COURTESY, &clp->cl_flags);
> +		if (cour && ktime_get_boottime_seconds() >=
> +				(clp->cl_time + NFSD_COURTESY_CLIENT_TIMEOUT))
> +			goto exp_client;
> +		if (nfs4_anylock_blocker(clp)) {
> +			/* expired client has state and has blocker. */
> +exp_client:
> +			if (mark_client_expired_locked(clp))
> +				continue;
> +			list_add(&clp->cl_lru, reaplist);
> 			continue;
> -		list_add(&clp->cl_lru, reaplist);
> +		}
> +		/*
> +		 * Client expired and has state and has no blockers.
> +		 * If there is race condition with blockers, next time
> +		 * the laundromat runs it will catch it and expires
> +		 * the client.

This comment is still not clear. "Client expired ... it will catch
and expires the client." Do you mean "destroys the client" ?

> +		 */
> +		if (!cour) {
> +			set_bit(NFSD4_CLIENT_COURTESY, &clp->cl_flags);
> +			list_add(&clp->cl_cs_list, &cslist);
> +		}

cl_cs_lock is taken in the loop below, but don't you also need to
hold that lock when manipulating the courtesy state bits in the
code above?

> 	}
> 	spin_unlock(&nn->client_lock);
> +
> +	while (!list_empty(&cslist)) {
> +		clp = list_first_entry(&cslist, struct nfs4_client, cl_cs_list);
> +		list_del_init(&clp->cl_cs_list);
> +		spin_lock(&clp->cl_cs_lock);
> +		/*
> +		 * Client might have re-connected. Make sure it's
> +		 * still in courtesy state before removing its record.
> +		 */
> +		if (test_bit(NFSD4_CLIENT_EXPIRED, &clp->cl_flags) ||
> +			!test_bit(NFSD4_CLIENT_COURTESY, &clp->cl_flags)) {
> +			spin_unlock(&clp->cl_cs_lock);
> +			continue;
> +		}
> +		spin_unlock(&clp->cl_cs_lock);
> +		nfsd4_client_record_remove(clp);
> +	}
> }
> 
> static time64_t
> -- 
> 2.9.5
> 

--
Chuck Lever