Re: [PATCH v2 2/2] NFSD: add shrinker to reap courtesy clients on low memory condition

On Mon, 2022-08-29 at 11:25 -0700, dai.ngo@xxxxxxxxxx wrote:
> On 8/29/22 10:15 AM, Jeff Layton wrote:
> > On Sun, 2022-08-28 at 17:47 -0700, Dai Ngo wrote:
> > > Add the courtesy client shrinker to react to low memory condition
> > > triggered by the memory shrinker.
> > > 
> > > On the shrinker's count callback, we increment a callback counter
> > > and return the number of outstanding courtesy clients. When the
> > > laundromat runs, it checks if this counter is not zero and starts
> > > reaping old courtesy clients. The maximum number of clients to be
> > > reaped is limited to NFSD_CLIENT_MAX_TRIM_PER_RUN (128). This limit
> > > is to prevent the laundromat from spending too much time reaping
> > > the clients and not processing other tasks in a timely manner.
> > > 
> > > The laundromat is rescheduled to run sooner if it detects a low
> > > memory condition and there are more clients to reap.
> > > 
> > > On the shrinker's scan callback, we return the number of clients
> > > that were reaped since the last scan callback. We cannot reap
> > > the clients in the scan callback context, since destroying a
> > > client might require calls into the underlying filesystem or other
> > > subsystems that might allocate memory, which can cause deadlock.
> > > 
> > > Signed-off-by: Dai Ngo <dai.ngo@xxxxxxxxxx>
> > > ---
> > >   fs/nfsd/netns.h     |  3 +++
> > >   fs/nfsd/nfs4state.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++----
> > >   fs/nfsd/nfsctl.c    |  6 ++++--
> > >   fs/nfsd/nfsd.h      |  9 +++++++--
> > >   4 files changed, 61 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > > index 2695dff1378a..2a604951623f 100644
> > > --- a/fs/nfsd/netns.h
> > > +++ b/fs/nfsd/netns.h
> > > @@ -194,6 +194,9 @@ struct nfsd_net {
> > >   	int			nfs4_max_clients;
> > >   
> > >   	atomic_t		nfsd_courtesy_client_count;
> > > +	atomic_t		nfsd_client_shrinker_cb_count;
> > > +	atomic_t		nfsd_client_shrinker_reapcount;
> > > +	struct shrinker		nfsd_client_shrinker;
> > >   };
> > >   
> > >   /* Simple check to find out if a given net was properly initialized */
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 3d8d7ebb5b91..9d5a20f0c3c4 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -4341,7 +4341,39 @@ nfsd4_init_slabs(void)
> > >   	return -ENOMEM;
> > >   }
> > >   
> > > -void nfsd4_init_leases_net(struct nfsd_net *nn)
> > > +static unsigned long
> > > +nfsd_courtesy_client_count(struct shrinker *shrink, struct shrink_control *sc)
> > > +{
> > > +	struct nfsd_net *nn = container_of(shrink,
> > > +			struct nfsd_net, nfsd_client_shrinker);
> > > +
> > > +	atomic_inc(&nn->nfsd_client_shrinker_cb_count);
> > > +	return (unsigned long)atomic_read(&nn->nfsd_courtesy_client_count);
> > > +}
> > > +
> > > +static unsigned long
> > > +nfsd_courtesy_client_scan(struct shrinker *shrink, struct shrink_control *sc)
> > > +{
> > > +	struct nfsd_net *nn = container_of(shrink,
> > > +			struct nfsd_net, nfsd_client_shrinker);
> > > +	unsigned long cnt;
> > > +
> > > +	cnt = atomic_read(&nn->nfsd_client_shrinker_reapcount);
> > > +	atomic_set(&nn->nfsd_client_shrinker_reapcount, 0);
> > Is it legit to return that we freed these objects when it hasn't
> > actually been done yet? Maybe this should always return 0? I'm not sure
> > what the rules are with shrinkers.
> 
> nfsd_client_shrinker_reapcount is the actual number of clients that
> the laundromat was able to destroy in its last run.
> 
> > 
> > Either way, it seems like "scan" should cue the laundromat to run ASAP.
> > When this is called, it may be quite some time before the laundromat
> > runs again. Currently, it's always just scheduled out to when we know
> > there may be work to be done, but this is a different situation.
> 
> Normally the "scan" callback is used to free unused resources and return
> the number of objects freed. For the NFSv4 clients, we cannot free the
> clients in the "scan" context due to potential deadlock, and the
> laundromat might also block while destroying the clients. Because of
> this we use the "count" callback to notify the laundromat of the low
> memory condition.
> 
> In the "count" callback we do not call mod_delayed_work to kick-start
> the laundromat, since we do not want to rely on the inner workings of
> mod_delayed_work to guarantee no deadlock. I also think we should do
> the minimum while in the memory shrinker's callback context.
> 

Are you aware of a specific problem with shrinkers and queueing work?

As I understand it, shrinkers are run in the context of an allocation.
An allocation crosses a threshold of some sort and we start trying
to free stuff using shrinkers. Queueing delayed work will never allocate
anything, so I would think it'd be safe to do in the count callback.
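
If that's the case, the count callback itself could kick the laundromat
rather than waiting for its next scheduled run. An untested sketch, just
to illustrate -- laundry_wq and nn->laundromat_work are the names the
existing laundromat code uses, but treat this as a suggestion, not a
tested change:

```c
static unsigned long
nfsd_courtesy_client_count(struct shrinker *shrink, struct shrink_control *sc)
{
	struct nfsd_net *nn = container_of(shrink,
			struct nfsd_net, nfsd_client_shrinker);

	atomic_inc(&nn->nfsd_client_shrinker_cb_count);
	/* mod_delayed_work() doesn't allocate, so calling it here
	 * shouldn't deadlock; a delay of 0 runs the laundromat as
	 * soon as the workqueue gets to it. */
	mod_delayed_work(laundry_wq, &nn->laundromat_work, 0);
	return (unsigned long)atomic_read(&nn->nfsd_courtesy_client_count);
}
```

That keeps the actual freeing in the laundromat, which preserves your
deadlock reasoning.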

> Once the laundromat runs, it monitors the low memory condition and
> reschedules itself to run immediately (NFSD_LAUNDROMAT_MINTIMEOUT) if
> needed.
> 
> Thanks,
> -Dai
> 
> > 
> > > +	return cnt;
> > > +}
> > > +
> > > +static int
> > > +nfsd_register_client_shrinker(struct nfsd_net *nn)
> > > +{
> > > +	nn->nfsd_client_shrinker.scan_objects = nfsd_courtesy_client_scan;
> > > +	nn->nfsd_client_shrinker.count_objects = nfsd_courtesy_client_count;
> > > +	nn->nfsd_client_shrinker.seeks = DEFAULT_SEEKS;
> > > +	return register_shrinker(&nn->nfsd_client_shrinker, "nfsd-client");
> > > +}
> > > +
> > > +int
> > > +nfsd4_init_leases_net(struct nfsd_net *nn)
> > >   {
> > >   	struct sysinfo si;
> > >   	u64 max_clients;
> > > @@ -4362,6 +4394,8 @@ void nfsd4_init_leases_net(struct nfsd_net *nn)
> > >   	nn->nfs4_max_clients = max_t(int, max_clients, NFS4_CLIENTS_PER_GB);
> > >   
> > >   	atomic_set(&nn->nfsd_courtesy_client_count, 0);
> > > +	atomic_set(&nn->nfsd_client_shrinker_cb_count, 0);
> > > +	return nfsd_register_client_shrinker(nn);
> > >   }
> > >   
> > >   static void init_nfs4_replay(struct nfs4_replay *rp)
> > > @@ -5870,12 +5904,17 @@ static void
> > >   nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist,
> > >   				struct laundry_time *lt)
> > >   {
> > > -	unsigned int oldstate, maxreap, reapcnt = 0;
> > > +	unsigned int oldstate, maxreap = 0, reapcnt = 0;
> > > +	int cb_cnt;
> > >   	struct list_head *pos, *next;
> > >   	struct nfs4_client *clp;
> > >   
> > > -	maxreap = (atomic_read(&nn->nfs4_client_count) >= nn->nfs4_max_clients) ?
> > > -			NFSD_CLIENT_MAX_TRIM_PER_RUN : 0;
> > > +	cb_cnt = atomic_read(&nn->nfsd_client_shrinker_cb_count);
> > > +	if (atomic_read(&nn->nfs4_client_count) >= nn->nfs4_max_clients ||
> > > +							cb_cnt) {
> > > +		maxreap = NFSD_CLIENT_MAX_TRIM_PER_RUN;
> > > +		atomic_set(&nn->nfsd_client_shrinker_cb_count, 0);
> > > +	}
> > >   	INIT_LIST_HEAD(reaplist);
> > >   	spin_lock(&nn->client_lock);
> > >   	list_for_each_safe(pos, next, &nn->client_lru) {
> > > @@ -5902,6 +5941,8 @@ nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist,
> > >   		}
> > >   	}
> > >   	spin_unlock(&nn->client_lock);
> > > +	if (cb_cnt)
> > > +		atomic_add(reapcnt, &nn->nfsd_client_shrinker_reapcount);
> > >   }
> > >   
> > >   static time64_t
> > > @@ -5942,6 +5983,8 @@ nfs4_laundromat(struct nfsd_net *nn)
> > >   		list_del_init(&clp->cl_lru);
> > >   		expire_client(clp);
> > >   	}
> > > +	if (atomic_read(&nn->nfsd_client_shrinker_cb_count) > 0)
> > > +		lt.new_timeo = NFSD_LAUNDROMAT_MINTIMEOUT;
> > >   	spin_lock(&state_lock);
> > >   	list_for_each_safe(pos, next, &nn->del_recall_lru) {
> > >   		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> > > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > > index 917fa1892fd2..597a26ad4183 100644
> > > --- a/fs/nfsd/nfsctl.c
> > > +++ b/fs/nfsd/nfsctl.c
> > > @@ -1481,11 +1481,12 @@ static __net_init int nfsd_init_net(struct net *net)
> > >   		goto out_idmap_error;
> > >   	nn->nfsd_versions = NULL;
> > >   	nn->nfsd4_minorversions = NULL;
> > > +	retval = nfsd4_init_leases_net(nn);
> > > +	if (retval)
> > > +		goto out_drc_error;
> > >   	retval = nfsd_reply_cache_init(nn);
> > >   	if (retval)
> > >   		goto out_drc_error;
> > > -	nfsd4_init_leases_net(nn);
> > > -
> > >   	get_random_bytes(&nn->siphash_key, sizeof(nn->siphash_key));
> > >   	seqlock_init(&nn->writeverf_lock);
> > >   
> > > @@ -1507,6 +1508,7 @@ static __net_exit void nfsd_exit_net(struct net *net)
> > >   	nfsd_idmap_shutdown(net);
> > >   	nfsd_export_shutdown(net);
> > >   	nfsd_netns_free_versions(net_generic(net, nfsd_net_id));
> > > +	nfsd4_leases_net_shutdown(nn);
> > >   }
> > >   
> > >   static struct pernet_operations nfsd_net_ops = {
> > > diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h
> > > index 57a468ed85c3..7e05ab7a3532 100644
> > > --- a/fs/nfsd/nfsd.h
> > > +++ b/fs/nfsd/nfsd.h
> > > @@ -498,7 +498,11 @@ extern void unregister_cld_notifier(void);
> > >   extern void nfsd4_ssc_init_umount_work(struct nfsd_net *nn);
> > >   #endif
> > >   
> > > -extern void nfsd4_init_leases_net(struct nfsd_net *nn);
> > > +extern int nfsd4_init_leases_net(struct nfsd_net *nn);
> > > +static inline void nfsd4_leases_net_shutdown(struct nfsd_net *nn)
> > > +{
> > > +	unregister_shrinker(&nn->nfsd_client_shrinker);
> > > +};
> > >   
> > >   #else /* CONFIG_NFSD_V4 */
> > >   static inline int nfsd4_is_junction(struct dentry *dentry)
> > > @@ -506,7 +510,8 @@ static inline int nfsd4_is_junction(struct dentry *dentry)
> > >   	return 0;
> > >   }
> > >   
> > > -static inline void nfsd4_init_leases_net(struct nfsd_net *nn) {};
> > > +static inline int nfsd4_init_leases_net(struct nfsd_net *nn) { return 0; };
> > > +static inline void nfsd4_leases_net_shutdown(struct nfsd_net *nn) { };
> > >   
> > >   #define register_cld_notifier() 0
> > >   #define unregister_cld_notifier() do { } while(0)

-- 
Jeff Layton <jlayton@xxxxxxxxxx>



