
The patch titled
     Subject: mm: vmscan: shrink deferred objects proportional to priority
has been added to the -mm tree.  Its filename is
     mm-vmscan-shrink-deferred-objects-proportional-to-priority.patch

This patch should soon appear at
    https://ozlabs.org/~akpm/mmots/broken-out/mm-vmscan-shrink-deferred-objects-proportional-to-priority.patch
and later at
    https://ozlabs.org/~akpm/mmotm/broken-out/mm-vmscan-shrink-deferred-objects-proportional-to-priority.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Yang Shi <shy828301@xxxxxxxxx>
Subject: mm: vmscan: shrink deferred objects proportional to priority

The number of deferred objects might wind up to an absurd number, and
this results in clamping of slab objects.  This is undesirable for
sustaining the working set.

So shrink deferred objects proportional to priority and cap nr_deferred to
twice the number of cache items.

The idea is borrowed from Dave Chinner's patch:
https://lore.kernel.org/linux-xfs/20191031234618.15403-13-david@xxxxxxxxxxxxx/

Tested with kernel build and vfs metadata heavy workload in our production
environment; no regression has been spotted so far.

Link: https://lkml.kernel.org/r/20210311190845.9708-14-shy828301@xxxxxxxxx
Signed-off-by: Yang Shi <shy828301@xxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Roman Gushchin <guro@xxxxxx>
Cc: Shakeel Butt <shakeelb@xxxxxxxxxx>
Cc: Vlastimil Babka <vbabka@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/vmscan.c |   46 +++++++++++-----------------------------------
 1 file changed, 11 insertions(+), 35 deletions(-)

--- a/mm/vmscan.c~mm-vmscan-shrink-deferred-objects-proportional-to-priority
+++ a/mm/vmscan.c
@@ -664,7 +664,6 @@ static unsigned long do_shrink_slab(stru
 	 */
 	nr = xchg_nr_deferred(shrinker, shrinkctl);
 
-	total_scan = nr;
 	if (shrinker->seeks) {
 		delta = freeable >> priority;
 		delta *= 4;
@@ -678,37 +677,9 @@ static unsigned long do_shrink_slab(stru
 		delta = freeable / 2;
 	}
 
+	total_scan = nr >> priority;
 	total_scan += delta;
-	if (total_scan < 0) {
-		pr_err("shrink_slab: %pS negative objects to delete nr=%ld\n",
-		       shrinker->scan_objects, total_scan);
-		total_scan = freeable;
-		next_deferred = nr;
-	} else
-		next_deferred = total_scan;
-
-	/*
-	 * We need to avoid excessive windup on filesystem shrinkers
-	 * due to large numbers of GFP_NOFS allocations causing the
-	 * shrinkers to return -1 all the time. This results in a large
-	 * nr being built up so when a shrink that can do some work
-	 * comes along it empties the entire cache due to nr >>>
-	 * freeable. This is bad for sustaining a working set in
-	 * memory.
-	 *
-	 * Hence only allow the shrinker to scan the entire cache when
-	 * a large delta change is calculated directly.
-	 */
-	if (delta < freeable / 4)
-		total_scan = min(total_scan, freeable / 2);
-
-	/*
-	 * Avoid risking looping forever due to too large nr value:
-	 * never try to free more than twice the estimate number of
-	 * freeable entries.
-	 */
-	if (total_scan > freeable * 2)
-		total_scan = freeable * 2;
+	total_scan = min(total_scan, (2 * freeable));
 
 	trace_mm_shrink_slab_start(shrinker, shrinkctl, nr,
 				   freeable, delta, total_scan, priority);
@@ -747,10 +718,15 @@ static unsigned long do_shrink_slab(stru
 		cond_resched();
 	}
 
-	if (next_deferred >= scanned)
-		next_deferred -= scanned;
-	else
-		next_deferred = 0;
+	/*
+	 * The deferred work is increased by any new work (delta) that wasn't
+	 * done, decreased by old deferred work that was done now.
+	 *
+	 * And it is capped to two times of the freeable items.
+	 */
+	next_deferred = max_t(long, (nr + delta - scanned), 0);
+	next_deferred = min(next_deferred, (2 * freeable));
+
 	/*
 	 * move the unused scan count back into the shrinker in a
 	 * manner that handles concurrent updates.
_

Patches currently in -mm which might be from shy828301@xxxxxxxxx are

mm-vmscan-use-nid-from-shrink_control-for-tracepoint.patch
mm-vmscan-consolidate-shrinker_maps-handling-code.patch
mm-vmscan-use-shrinker_rwsem-to-protect-shrinker_maps-allocation.patch
mm-vmscan-remove-memcg_shrinker_map_size.patch
mm-vmscan-use-kvfree_rcu-instead-of-call_rcu.patch
mm-memcontrol-rename-shrinker_map-to-shrinker_info.patch
mm-vmscan-add-shrinker_info_protected-helper.patch
mm-vmscan-use-a-new-flag-to-indicate-shrinker-is-registered.patch
mm-vmscan-add-per-memcg-shrinker-nr_deferred.patch
mm-vmscan-use-per-memcg-nr_deferred-of-shrinker.patch
mm-vmscan-dont-need-allocate-shrinker-nr_deferred-for-memcg-aware-shrinkers.patch
mm-memcontrol-reparent-nr_deferred-when-memcg-offline.patch
mm-vmscan-shrink-deferred-objects-proportional-to-priority.patch

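
Illustration (not part of the patch): the arithmetic introduced by the diff
above can be tried out in userspace with the minimal sketch below.  The
helper names plan_total_scan() and plan_next_deferred() are hypothetical
and do not exist in mm/vmscan.c; delta is taken as an input rather than
computed from shrinker->seeks, nr is assumed to be non-negative, and the
0..12 priority range is an assumption based on DEF_PRIORITY being 12.

```c
#include <assert.h>

/* Hypothetical userspace helpers; these are not kernel functions. */
static long min_long(long a, long b) { return a < b ? a : b; }
static long max_long(long a, long b) { return a > b ? a : b; }

/*
 * Scan target for one pass: the deferred count (nr) is scaled down by
 * the reclaim priority, the new work (delta) is added, and the result
 * is capped at twice the number of freeable objects.
 */
static long plan_total_scan(long nr, long delta, long freeable, int priority)
{
	/* Assumed range: 0 (highest pressure) .. 12 (DEF_PRIORITY). */
	assert(priority >= 0 && priority <= 12);

	long total_scan = (nr >> priority) + delta;

	return min_long(total_scan, 2 * freeable);
}

/*
 * Work carried over to the next pass: old deferred work plus new work,
 * minus what was actually scanned, clamped to [0, 2 * freeable].
 */
static long plan_next_deferred(long nr, long delta, long scanned,
			       long freeable)
{
	long next_deferred = max_long(nr + delta - scanned, 0);

	return min_long(next_deferred, 2 * freeable);
}
```

With nr = 1000000, delta = 100 and freeable = 10000, a pass at priority 12
plans to scan (1000000 >> 12) + 100 = 344 objects rather than the whole
backlog, and whatever remains is carried over capped at 2 * freeable = 20000.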