On Mon, 2022-08-29 at 11:25 -0700, dai.ngo@xxxxxxxxxx wrote: > On 8/29/22 10:15 AM, Jeff Layton wrote: > > On Sun, 2022-08-28 at 17:47 -0700, Dai Ngo wrote: > > > Add the courtesy client shrinker to react to low memory condition > > > triggered by the memory shrinker. > > > > > > On the shrinker's count callback, we increment a callback counter > > > and return the number of outstanding courtesy clients. When the > > > laundromat runs, it checks if this counter is not zero and starts > > > reaping old courtesy clients. The maximum number of clients to be > > > reaped is limited to NFSD_CIENT_MAX_TRIM_PER_RUN (128). This limit > > > is to prevent the laundromat from spending too much time reaping > > > the clients and not processing other tasks in a timely manner. > > > > > > The laundromat is rescheduled to run sooner if it detects low > > > low memory condition and there are more clients to reap. > > > > > > On the shrinker's scan callback, we return the number of clients > > > That were reaped since the last scan callback. We can not reap > > > the clients on the scan callback context since destroying the > > > client might require call into the underlying filesystem or other > > > subsystems which might allocate memory which can cause deadlock. > > > > > > Signed-off-by: Dai Ngo <dai.ngo@xxxxxxxxxx> > > > --- > > > fs/nfsd/netns.h | 3 +++ > > > fs/nfsd/nfs4state.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++---- > > > fs/nfsd/nfsctl.c | 6 ++++-- > > > fs/nfsd/nfsd.h | 9 +++++++-- > > > 4 files changed, 61 insertions(+), 8 deletions(-) > > > > > > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h > > > index 2695dff1378a..2a604951623f 100644 > > > --- a/fs/nfsd/netns.h > > > +++ b/fs/nfsd/netns.h > > > @@ -194,6 +194,9 @@ struct nfsd_net { > > > int nfs4_max_clients; > > > > > > atomic_t nfsd_courtesy_client_count; > > > + atomic_t nfsd_client_shrinker_cb_count; > > > + atomic_t nfsd_client_shrinker_reapcount; > > > + struct shrinker nfsd_client_shrinker; > > > }; > > > > > > /* Simple check to find out if a given net was properly initialized */ > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > > > index 3d8d7ebb5b91..9d5a20f0c3c4 100644 > > > --- a/fs/nfsd/nfs4state.c > > > +++ b/fs/nfsd/nfs4state.c > > > @@ -4341,7 +4341,39 @@ nfsd4_init_slabs(void) > > > return -ENOMEM; > > > } > > > > > > -void nfsd4_init_leases_net(struct nfsd_net *nn) > > > +static unsigned long > > > +nfsd_courtesy_client_count(struct shrinker *shrink, struct shrink_control *sc) > > > +{ > > > + struct nfsd_net *nn = container_of(shrink, > > > + struct nfsd_net, nfsd_client_shrinker); > > > + > > > + atomic_inc(&nn->nfsd_client_shrinker_cb_count); > > > + return (unsigned long)atomic_read(&nn->nfsd_courtesy_client_count); > > > +} > > > + > > > +static unsigned long > > > +nfsd_courtesy_client_scan(struct shrinker *shrink, struct shrink_control *sc) > > > +{ > > > + struct nfsd_net *nn = container_of(shrink, > > > + struct nfsd_net, nfsd_client_shrinker); > > > + unsigned long cnt; > > > + > > > + cnt = atomic_read(&nn->nfsd_client_shrinker_reapcount); > > > + atomic_set(&nn->nfsd_client_shrinker_reapcount, 0); > > Is it legit to return that we freed these objects when it hasn't > > actually been done yet? Maybe this should always return 0? I'm not sure > > what the rules are with shrinkers. > > nfsd_client_shrinker_reapcount is the actual number of clients that > the laudromat was able to destroy in its last run. > > > > > Either way, it seems like "scan" should cue the laundromat to run ASAP. > > When this is called, it may be quite some time before the laundromat > > runs again. Currently, it's always just scheduled it out to when we know > > there may be work to be done, but this is a different situation. > > Normally the "scan" callback is used to free unused resources and return > the number of objects freed. For the NFSv4 clients, we can not free the > clients on the "scan" context due to potential deadlock and also the > laundromat might block while destroying the clients. Because of this we > use the "count" callback to notify the laundromat of the low memory > condition. > > In the "count" callback we do not call mod_delayed_work to kick start > the laundromat since we do not want to rely on the inner working > mod_delayed_work to guarantee no deadlock. I also think we should do > the minimal while on the memory shrinker's callback context. > Are you aware of a specific problem with shrinkers and queueing work? As I understand it, shrinkers are run in the context of an allocation. An allocation crosses a threshold of some sort and and we start trying to free stuff using shrinkers. queueing delayed work will never allocate anything, so I would think it'd be safe to do in the count callback. > Once the laundromat runs it monitors the low memory condition and > reschedules itself to run immediately (NFSD_LAUNDROMAT_MINTIMEOUT) if > needed. > > Thanks, > -Dai > > > > > > + return cnt; > > > +} > > > + > > > +static int > > > +nfsd_register_client_shrinker(struct nfsd_net *nn) > > > +{ > > > + nn->nfsd_client_shrinker.scan_objects = nfsd_courtesy_client_scan; > > > + nn->nfsd_client_shrinker.count_objects = nfsd_courtesy_client_count; > > > + nn->nfsd_client_shrinker.seeks = DEFAULT_SEEKS; > > > + return register_shrinker(&nn->nfsd_client_shrinker, "nfsd-client"); > > > +} > > > + > > > +int > > > +nfsd4_init_leases_net(struct nfsd_net *nn) > > > { > > > struct sysinfo si; > > > u64 max_clients; > > > @@ -4362,6 +4394,8 @@ void nfsd4_init_leases_net(struct nfsd_net *nn) > > > nn->nfs4_max_clients = max_t(int, max_clients, NFS4_CLIENTS_PER_GB); > > > > > > atomic_set(&nn->nfsd_courtesy_client_count, 0); > > > + atomic_set(&nn->nfsd_client_shrinker_cb_count, 0); > > > + return nfsd_register_client_shrinker(nn); > > > } > > > > > > static void init_nfs4_replay(struct nfs4_replay *rp) > > > @@ -5870,12 +5904,17 @@ static void > > > nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist, > > > struct laundry_time *lt) > > > { > > > - unsigned int oldstate, maxreap, reapcnt = 0; > > > + unsigned int oldstate, maxreap = 0, reapcnt = 0; > > > + int cb_cnt; > > > struct list_head *pos, *next; > > > struct nfs4_client *clp; > > > > > > - maxreap = (atomic_read(&nn->nfs4_client_count) >= nn->nfs4_max_clients) ? > > > - NFSD_CLIENT_MAX_TRIM_PER_RUN : 0; > > > + cb_cnt = atomic_read(&nn->nfsd_client_shrinker_cb_count); > > > + if (atomic_read(&nn->nfs4_client_count) >= nn->nfs4_max_clients || > > > + cb_cnt) { > > > + maxreap = NFSD_CLIENT_MAX_TRIM_PER_RUN; > > > + atomic_set(&nn->nfsd_client_shrinker_cb_count, 0); > > > + } > > > INIT_LIST_HEAD(reaplist); > > > spin_lock(&nn->client_lock); > > > list_for_each_safe(pos, next, &nn->client_lru) { > > > @@ -5902,6 +5941,8 @@ nfs4_get_client_reaplist(struct nfsd_net *nn, struct list_head *reaplist, > > > } > > > } > > > spin_unlock(&nn->client_lock); > > > + if (cb_cnt) > > > + atomic_add(reapcnt, &nn->nfsd_client_shrinker_reapcount); > > > } > > > > > > static time64_t > > > @@ -5942,6 +5983,8 @@ nfs4_laundromat(struct nfsd_net *nn) > > > list_del_init(&clp->cl_lru); > > > expire_client(clp); > > > } > > > + if (atomic_read(&nn->nfsd_client_shrinker_cb_count) > 0) > > > + lt.new_timeo = NFSD_LAUNDROMAT_MINTIMEOUT; > > > spin_lock(&state_lock); > > > list_for_each_safe(pos, next, &nn->del_recall_lru) { > > > dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru); > > > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c > > > index 917fa1892fd2..597a26ad4183 100644 > > > --- a/fs/nfsd/nfsctl.c > > > +++ b/fs/nfsd/nfsctl.c > > > @@ -1481,11 +1481,12 @@ static __net_init int nfsd_init_net(struct net *net) > > > goto out_idmap_error; > > > nn->nfsd_versions = NULL; > > > nn->nfsd4_minorversions = NULL; > > > + retval = nfsd4_init_leases_net(nn); > > > + if (retval) > > > + goto out_drc_error; > > > retval = nfsd_reply_cache_init(nn); > > > if (retval) > > > goto out_drc_error; > > > - nfsd4_init_leases_net(nn); > > > - > > > get_random_bytes(&nn->siphash_key, sizeof(nn->siphash_key)); > > > seqlock_init(&nn->writeverf_lock); > > > > > > @@ -1507,6 +1508,7 @@ static __net_exit void nfsd_exit_net(struct net *net) > > > nfsd_idmap_shutdown(net); > > > nfsd_export_shutdown(net); > > > nfsd_netns_free_versions(net_generic(net, nfsd_net_id)); > > > + nfsd4_leases_net_shutdown(nn); > > > } > > > > > > static struct pernet_operations nfsd_net_ops = { > > > diff --git a/fs/nfsd/nfsd.h b/fs/nfsd/nfsd.h > > > index 57a468ed85c3..7e05ab7a3532 100644 > > > --- a/fs/nfsd/nfsd.h > > > +++ b/fs/nfsd/nfsd.h > > > @@ -498,7 +498,11 @@ extern void unregister_cld_notifier(void); > > > extern void nfsd4_ssc_init_umount_work(struct nfsd_net *nn); > > > #endif > > > > > > -extern void nfsd4_init_leases_net(struct nfsd_net *nn); > > > +extern int nfsd4_init_leases_net(struct nfsd_net *nn); > > > +static inline void nfsd4_leases_net_shutdown(struct nfsd_net *nn) > > > +{ > > > + unregister_shrinker(&nn->nfsd_client_shrinker); > > > +}; > > > > > > #else /* CONFIG_NFSD_V4 */ > > > static inline int nfsd4_is_junction(struct dentry *dentry) > > > @@ -506,7 +510,8 @@ static inline int nfsd4_is_junction(struct dentry *dentry) > > > return 0; > > > } > > > > > > -static inline void nfsd4_init_leases_net(struct nfsd_net *nn) {}; > > > +static inline int nfsd4_init_leases_net(struct nfsd_net *nn) { return 0; }; > > > +static inline void nfsd4_leases_net_shutdown(struct nfsd_net *nn) { }; > > > > > > #define register_cld_notifier() 0 > > > #define unregister_cld_notifier() do { } while(0) -- Jeff Layton <jlayton@xxxxxxxxxx>