On Fri, Nov 15, 2019 at 08:11:50AM +1100, Dave Chinner wrote: > On Mon, Nov 04, 2019 at 10:29:54AM -0500, Brian Foster wrote: > > On Fri, Nov 01, 2019 at 10:46:02AM +1100, Dave Chinner wrote: > > > @@ -601,10 +605,10 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl, > > > * scanning at high prio and therefore should try to reclaim as much as > > > * possible. > > > */ > > > - while (total_scan >= batch_size || > > > - total_scan >= freeable_objects) { > > > + while (scan_count >= batch_size || > > > + scan_count >= freeable_objects) { > > > unsigned long ret; > > > - unsigned long nr_to_scan = min(batch_size, total_scan); > > > + unsigned long nr_to_scan = min_t(long, batch_size, scan_count); > > > > > > shrinkctl->nr_to_scan = nr_to_scan; > > > shrinkctl->nr_scanned = nr_to_scan; > > > @@ -614,29 +618,29 @@ static unsigned long do_shrink_slab(struct shrink_control *shrinkctl, > > > freed += ret; > > > > > > count_vm_events(SLABS_SCANNED, shrinkctl->nr_scanned); > > > - total_scan -= shrinkctl->nr_scanned; > > > - scanned += shrinkctl->nr_scanned; > > > + scan_count -= shrinkctl->nr_scanned; > > > + scanned_objects += shrinkctl->nr_scanned; > > > > > > cond_resched(); > > > } > > > - > > > done: > > > - if (next_deferred >= scanned) > > > - next_deferred -= scanned; > > > + if (deferred_count) > > > + next_deferred = deferred_count - scanned_objects; > > > else > > > - next_deferred = 0; > > > + next_deferred = scan_count; > > > > Hmm.. so if there was no deferred count on this cycle, we set > > next_deferred to whatever is left from scan_count and add that back into > > the shrinker struct below. If there was a pending deferred count on this > > cycle, we subtract what we scanned from that and add that value back. > > But what happens to the remaining scan_count in the latter case? Is it > > lost, or am I missing something? > > if deferred_count is not zero, then it is kswapd that is running. It > does the deferred work, and if it doesn't make progress then adding > it's scan count to the deferred work doesn't matter. That's because > it will come back with an increased priority in a short while and > try to scan more of the deferred count plus it's larger scan count. > Ok, so perhaps there is no functional reason to defer remaining scan count from a context (i.e. kswapd) that attempts to process deferred work... > IOWs, if we defer kswapd unused scan count, we effectively increase > the pressure as the priority goes up, potentially making the > deferred count increase out of control. i.e. kswapd can make > progress and free items, but the result is that it increased the > deferred scan count rather than reducing it. This leads to excessive > reclaim of the slab caches and kswapd can trash the caches long > after the memory pressure has gone away... > ... yet if kswapd runs without pre-existing deferred work, that's precisely what it does. next_deferred is set to remaining scan_count and that is added back to the shrinker struct. So should kswapd generally defer work or not? If the answer is sometimes, then please add a comment to the next_deferred assignment to explain when/why. > > For example, suppose we start this cycle with a large scan_count and > > ->scan_objects() returned SHRINK_STOP before doing much work. In that > > scenario, it looks like whether ->nr_deferred is 0 or not is the only > > thing that determines whether we defer the entire remaining scan_count > > or just what is left from the previous ->nr_deferred. The existing code > > appears to consistently factor in what is left from the current scan > > with the previous deferred count. Hm? > > If kswapd doesn't have any deferred work, then it's largely no > different in behaviour to direct reclaim. If it has no deferred > work, then the shrinker is not getting stopped early in direct > reclaim, so it's unlikely that kswapd is going to get stopped early, > either.... > Then perhaps the logic could be simplified to explicitly not defer from kswapd..? Brian > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx >