On 1/3/25 5:53 PM, Tejun Heo wrote:
On Sat, Jan 04, 2025 at 09:50:32AM +1100, NeilBrown wrote:
On Fri, 03 Jan 2025, cel@xxxxxxxxxx wrote:
...
I think that instead of passing "list_lru_count()" we should pass some
constant like 1024.
    cnt = list_lru_count();
    while (cnt > 0) {
        num = min(cnt, 1024);
        list_lru_walk(...., num);
        cond_resched();
        cnt -= num;
    }
Then run it from system_wq.
list_lru_shrink is most often called as list_lru_shrink_walk() from a
shrinker, and the pattern there is essentially that above. A count is
taken, possibly scaled down, then the shrinker is called in batches.
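
For anyone who wants to poke at the batching pattern outside the kernel, here is a minimal userspace sketch. Everything named toy_* is hypothetical and only stands in for the real list_lru calls; sched_yield() is used as a rough userspace analogue of cond_resched(), and the batch size is a parameter rather than a fixed 1024.

```c
#include <assert.h>
#include <sched.h>

/* Hypothetical stand-in for an LRU: only tracks how many items remain. */
struct toy_lru {
	unsigned long nr_items;
	unsigned long nr_yields;	/* number of times the drain loop yielded */
};

/* Hypothetical stand-in for list_lru_walk(): drops up to nr_to_walk items. */
static unsigned long toy_lru_walk(struct toy_lru *lru, unsigned long nr_to_walk)
{
	unsigned long n = nr_to_walk < lru->nr_items ? nr_to_walk : lru->nr_items;

	lru->nr_items -= n;
	return n;
}

/* Take a count once, then walk it in batches, yielding after each batch. */
static unsigned long toy_lru_drain(struct toy_lru *lru, unsigned long batch)
{
	unsigned long cnt = lru->nr_items;	/* snapshot, like list_lru_count() */
	unsigned long freed = 0;

	assert(batch > 0);
	while (cnt > 0) {
		unsigned long num = cnt < batch ? cnt : batch;

		freed += toy_lru_walk(lru, num);
		sched_yield();			/* assumption: stands in for cond_resched() */
		lru->nr_yields++;
		cnt -= num;
	}
	return freed;
}
```

With 2500 items and a batch of 1024, the loop makes three passes (1024, 1024, 452) and yields once after each.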
BTW, there's nothing wrong with taking some msecs or even tens of msecs
running on system_unbound_wq, so the current state may be fine too.
My thinking was that this work is low priority, so there should be
plenty of opportunity to set it aside for a few moments and handle
higher priority work. Maybe not worrisome on systems with a high core
count, but on small SMP (e.g., VM guests), I've found that tasks like this
can be rude neighbors.
We could do this by adding a cond_resched() call site in the loop,
or by taking Neil's suggestion of breaking up the free list across
multiple work items that each handle one or just a few file releases.
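
To make the second option concrete, here is a rough userspace sketch of slicing a free list into bounded work items. The toy_* names and the index-range representation are assumptions for illustration only; in the kernel each slice would presumably be handed to the workqueue as its own work item, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work item: releases files [start, start + count). */
struct toy_work {
	size_t start;
	size_t count;
};

/*
 * Split nr_files into work items of at most per_item files each.
 * Writes at most max_items entries to out and returns how many were written.
 */
static size_t toy_split_work(size_t nr_files, size_t per_item,
			     struct toy_work *out, size_t max_items)
{
	size_t n = 0, start = 0;

	assert(per_item > 0);
	while (start < nr_files && n < max_items) {
		size_t left = nr_files - start;

		out[n].start = start;
		out[n].count = left < per_item ? left : per_item;
		start += out[n].count;
		n++;
	}
	return n;
}
```

Ten files at four per item come out as three items covering 4, 4, and 2 files. Smaller items give the scheduler more chances to run other work between them, at the cost of more queueing overhead.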
--
Chuck Lever