Re: [PATCH v4 2/2] NFSD: add shrinker to reap courtesy clients on low memory condition

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 9/1/22 9:32 PM, Chuck Lever III wrote:

On Sep 1, 2022, at 9:56 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:

Hi Chuck,

On 8/31/22 7:30 AM, Chuck Lever III wrote:
	struct list_head *pos, *next;
	struct nfs4_client *clp;

-	maxreap = (atomic_read(&nn->nfs4_client_count) >= nn->nfs4_max_clients) ?
-			NFSD_CLIENT_MAX_TRIM_PER_RUN : 0;
+	cb_cnt = atomic_read(&nn->nfsd_client_shrinker_cb_count);
+	if (atomic_read(&nn->nfs4_client_count) >= nn->nfs4_max_clients ||
+							cb_cnt) {
+		maxreap = NFSD_CLIENT_MAX_TRIM_PER_RUN;
+		atomic_set(&nn->nfsd_client_shrinker_cb_count, 0);
+	}
I'm not terribly happy with this, but I don't have a better suggestion
at the moment. Let me think about it.
Do you have any suggestion to improve this, I want to incorporate it
before sending out v5?
Let's consider some broad outlines...

With regards to parametrizing the reaplist behavior, you want
a normal laundromat run to reap zero or more courtesy clients.
You want a shrinker laundromat run to reap more than zero. I
think you want a minreap variable as well as a maxreap variable
in there to control how the reaplist is built. Making @minreap
a function parameter rather than a global atomic would be a
plus for me, but maybe that's not practical.

I'm not quite understand how the minreap is used, I think it
will make the code more complex.


But I would prefer a more straightforward approach overall. The
proposed approach seems tricky and brittle, and needs a lot of
explaining to understand. Other subsystems seem to get away with
something simpler.

Can nfsd_courtesy_client_count() simply reduce
nn->nfs4_max_clients, kick the laundromat, and then return 0?
Then get rid of nfsd_courtesy_client_scan().

I need to think more about this approach. However at first glance,
nn->nfs4_max_clients is used to control how many clients, including
active and courtesy clients, are allowed in the system. If we lower
this count, it also prevent new clients from connecting to the
system. So now the shrinker mechanism does more than just getting
rid of unused resources, maybe that's ok?


Or, nfsd_courtesy_client_count() could return
nfsd_couresy_client_count. nfsd_courtesy_client_scan() could
then look something like this:

	if ((sc->gfp_mask & GFP_KERNEL) != GFP_KERNEL)
		return SHRINK_STOP;

	nfsd_get_client_reaplist(nn, reaplist, lt);
	list_for_each_safe(pos, next, &reaplist) {
		clp = list_entry(pos, struct nfs4_client, cl_lru);
		trace_nfsd_clid_purged(&clp->cl_clientid);
		list_del_init(&clp->cl_lru);
		expire_client(clp);
		count++;
	}
	return count;

This does not work, we cannot expire clients on the context of
scan callback due to deadlock problem.

I will experiment with ways to get rid of the scan function to
make the logic simpler.

Thanks,
-Dai


Obviously you would need to refactor common code into helper
functions.

--
Chuck Lever




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux