[Fixup Vladimir's email and drop the stable mailing list]

On Wed 12-10-16 09:09:49, Shaohua Li wrote:
> Our system uses significantly more slab memory with memcg enabled on the
> latest kernel. With a 3.10 kernel, slab uses 2G of memory; with a 4.6
> kernel, 6G is used. It looks like the shrinker has a problem. Say we have
> two memcgs for one shrinker. In do_shrink_slab():
> 
> 1. Check cg1: nr_deferred = 0, assume total_scan = 700. The batch size is
>    1024, so no memory is freed. nr_deferred = 700.
> 2. Check cg2: nr_deferred = 700. Assume freeable = 20, then total_scan = 10
>    or 40. Let's assume it's 10. No memory is freed. nr_deferred = 10.
> 
> cg1's deferred share is lost in this case, and kswapd will free no memory
> even if it runs the above steps again and again.
> 
> The fix makes sure one memcg's deferred share isn't lost.
> 
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx (v4.0+)
> Signed-off-by: Shaohua Li <shli@xxxxxx>
> ---
>  mm/vmscan.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 0fe8b71..c3822ae 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -291,6 +291,7 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
>  	int nid = shrinkctl->nid;
>  	long batch_size = shrinker->batch ? shrinker->batch
>  					  : SHRINK_BATCH;
> +	long scanned = 0, next_deferred;
>  
>  	freeable = shrinker->count_objects(shrinker, shrinkctl);
>  	if (freeable == 0)
> @@ -312,7 +313,9 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
>  		pr_err("shrink_slab: %pF negative objects to delete nr=%ld\n",
>  		       shrinker->scan_objects, total_scan);
>  		total_scan = freeable;
> -	}
> +		next_deferred = nr;
> +	} else
> +		next_deferred = total_scan;
>  
>  	/*
>  	 * We need to avoid excessive windup on filesystem shrinkers
> @@ -369,17 +372,22 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl,
>  
>  		count_vm_events(SLABS_SCANNED, nr_to_scan);
>  		total_scan -= nr_to_scan;
> +		scanned += nr_to_scan;
>  
>  		cond_resched();
>  	}
>  
> +	if (next_deferred >= scanned)
> +		next_deferred -= scanned;
> +	else
> +		next_deferred = 0;
>  	/*
>  	 * move the unused scan count back into the shrinker in a
>  	 * manner that handles concurrent updates. If we exhausted the
>  	 * scan, there is no need to do an update.
>  	 */
> -	if (total_scan > 0)
> -		new_nr = atomic_long_add_return(total_scan,
> +	if (next_deferred > 0)
> +		new_nr = atomic_long_add_return(next_deferred,
>  					&shrinker->nr_deferred[nid]);
>  	else
>  		new_nr = atomic_long_read(&shrinker->nr_deferred[nid]);
> -- 
> 2.9.3

-- 
Michal Hocko
SUSE Labs