On Tue 04-09-18 08:34:49, Roman Gushchin wrote:
> On Tue, Sep 04, 2018 at 09:00:05AM +0200, Michal Hocko wrote:
> > On Mon 03-09-18 13:28:06, Roman Gushchin wrote:
> > > On Mon, Sep 03, 2018 at 08:29:56PM +0200, Michal Hocko wrote:
> > > > On Fri 31-08-18 14:31:41, Roman Gushchin wrote:
> > > > > On Fri, Aug 31, 2018 at 05:15:39PM -0400, Rik van Riel wrote:
> > > > > > On Fri, 2018-08-31 at 13:34 -0700, Roman Gushchin wrote:
> > > > > >
> > > > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > > > index fa2c150ab7b9..c910cf6bf606 100644
> > > > > > > --- a/mm/vmscan.c
> > > > > > > +++ b/mm/vmscan.c
> > > > > > > @@ -476,6 +476,10 @@ static unsigned long do_shrink_slab(struct
> > > > > > > shrink_control *shrinkctl,
> > > > > > >  	delta = freeable >> priority;
> > > > > > >  	delta *= 4;
> > > > > > >  	do_div(delta, shrinker->seeks);
> > > > > > > +
> > > > > > > +	if (delta == 0 && freeable > 0)
> > > > > > > +		delta = min(freeable, batch_size);
> > > > > > > +
> > > > > > >  	total_scan += delta;
> > > > > > >  	if (total_scan < 0) {
> > > > > > >  		pr_err("shrink_slab: %pF negative objects to delete
> > > > > > > nr=%ld\n",
> > > > > >
> > > > > > I agree that we need to shrink slabs with fewer than
> > > > > > 4096 objects, but do we want to put more pressure on
> > > > > > a slab the moment it drops below 4096 than we applied
> > > > > > when it had just over 4096 objects on it?
> > > > > >
> > > > > > With this patch, a slab with 5000 objects on it will
> > > > > > get 1 item scanned, while a slab with 4000 objects on
> > > > > > it will see shrinker->batch or SHRINK_BATCH objects
> > > > > > scanned every time.
> > > > > >
> > > > > > I don't know if this would cause any issues, just
> > > > > > something to ponder.
> > > > >
> > > > > Hm, fair enough. So, basically we can always do
> > > > >
> > > > >     delta = max(delta, min(freeable, batch_size));
> > > > >
> > > > > Does it look better?
> > > >
> > > > Why don't you use the same heuristic we use for the normal LRU reclaim?
> > >
> > > Because we do reparent kmem lru lists on offlining.
> > > Take a look at memcg_offline_kmem().
> >
> > Then I must be missing something. Why are we growing the number of dead
> > cgroups then?
>
> We do reparent LRU lists, but not objects. Objects (or, more precisely, pages)
> are still holding a reference to the memcg.

OK, this is what I missed. I thought that the reparenting included all
the pages as well. Is there any strong reason that we cannot do that?
Performance/Locking/etc.?

Or maybe do not reparent at all and rely on the same reclaim heuristic
we use for normal pages?

I am not opposing your patch but I am trying to figure out whether that
is the best approach.
-- 
Michal Hocko
SUSE Labs