Re: [PATCH v2] mm/vmscan: shrink slab in node reclaim

Michal Hocko <mhocko@xxxxxxxxxx> · Tue, 6 Aug 2019 11:05:16 +0200



On Tue 06-08-19 16:57:22, Yafang Shao wrote:
> On Tue, Aug 6, 2019 at 3:41 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Tue 06-08-19 09:35:25, Michal Hocko wrote:
> > > On Tue 06-08-19 03:19:00, Yafang Shao wrote:
> > > > In node reclaim, may_shrinkslab is 0 by default, hence shrink_slab
> > > > will never be performed there, even though shrink_slab should be
> > > > performed if the reclaimable slab is over the min slab limit.
> > > >
> > > > Add scan_control::no_pagecache so shrink_node can decide to reclaim page
> > > > cache, slab, or both as dictated by min_unmapped_pages and min_slab_pages.
> > > > shrink_node will do at least one of the two because otherwise node_reclaim
> > > > returns early.
> > > >
> > > > __node_reclaim can detect when enough slab has been reclaimed because
> > > > sc.reclaim_state.reclaimed_slab will tell us how many pages are
> > > > reclaimed in shrink slab.
> > > >
> > > > This issue is very easy to reproduce: first continuously cat a random
> > > > nonexistent file to produce more and more dentries, then read a big
> > > > file to produce page cache. You will find that the dentries are never
> > > > shrunk in node reclaim (they can only be shrunk in kswapd until the
> > > > watermark is reached).
> > > >
> > > > Regarding vm.zone_reclaim_mode, we always set it to zero to disable node
> > > > reclaim. Someone may prefer to enable it if their different workloads work
> > > > on different nodes.
> > >
> > > Considering that this is a long term behavior of a rarely used node
> > > reclaim I would rather not touch it unless some _real_ workload suffers
> > > from this behavior. Or is there any reason to fix this even though there
> > > is no evidence of real workloads suffering from the current behavior?
> >
> > I have only now noticed that you have added
> > Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
> >
> > could you be more specific how that commit introduced a bug in the node
> > reclaim? It has introduced may_shrinkslab but the direct reclaim seems
> > to set it to 1.
> 
> As you said, the direct reclaim path sets it to 1, but
> __node_reclaim() forgot to handle may_shrinkslab.

OK, I am obviously blind. Sorry about that. Anyway, why can't we simply
get back to the original behavior by setting may_shrinkslab in that
path as well?
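
To make the question concrete, here is a toy userspace model of the idea.
It is not kernel code and not a patch. Every name with a _model suffix,
the take() helper, and the "page cache first, then slab" ordering are
assumptions made up for the illustration; only the may_shrinkslab bit
mirrors the real scan_control field. It merely shows that with the bit
left at 0 the slab pool is never touched, and that setting it in the
node reclaim path lets slab be shrunk as well.

```c
/* Userspace model only: these are NOT the kernel's definitions. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for struct scan_control. */
struct scan_control_model {
	unsigned long nr_to_reclaim;
	unsigned long nr_reclaimed;
	unsigned int may_shrinkslab:1;	/* 0: leave slab alone */
};

/* Hypothetical node: reclaimable page cache pages and slab pages. */
struct node_model {
	unsigned long pagecache_pages;
	unsigned long slab_pages;
};

/* Remove up to 'want' pages from a pool and return how many were taken. */
static unsigned long take(unsigned long *pool, unsigned long want)
{
	unsigned long got = *pool < want ? *pool : want;

	*pool -= got;
	return got;
}

/* Models shrink_node(): page cache first, slab only when allowed. */
static void shrink_node_model(struct node_model *node,
			      struct scan_control_model *sc)
{
	assert(sc->nr_reclaimed <= sc->nr_to_reclaim);

	sc->nr_reclaimed += take(&node->pagecache_pages,
				 sc->nr_to_reclaim - sc->nr_reclaimed);
	if (sc->may_shrinkslab)
		sc->nr_reclaimed += take(&node->slab_pages,
					 sc->nr_to_reclaim - sc->nr_reclaimed);
}

/* Models __node_reclaim(): the caller decides whether slab may be shrunk. */
static unsigned long node_reclaim_model(struct node_model *node,
					unsigned long nr_pages,
					bool shrink_slab)
{
	struct scan_control_model sc = {
		.nr_to_reclaim = nr_pages,
		.may_shrinkslab = shrink_slab,
	};

	shrink_node_model(node, &sc);
	return sc.nr_reclaimed;
}
```

For a node with 10 page cache pages and 100 slab pages, a request for 32
pages with the bit clear can only come from the page cache, while the
same request with the bit set is also served from slab.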
-- 
Michal Hocko
SUSE Labs