On Tue 30-06-20 17:27:13, Andrew Morton wrote:
> On Mon, 29 Jun 2020 09:57:42 -0700 Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:
> > I am okay with Matthew's suggestion of keeping the stack pagevec size unchanged.
> > Andrew, do you have a preference?
> >
> > I was assuming that for people who really care about saving the kernel memory
> > usage, they would make CONFIG_NR_CPUS small. I also have a hard time coming
> > up with a better scheme.
> >
> > Otherwise, we will have to adjust the pagevec size when we actually
> > found out how many CPUs we have brought online. It seems like a lot
> > of added complexity for going that route.
>
> Even if we were to do this, the worst-case stack usage on the largest
> systems might be an issue. If it isn't then we might as well hard-wire
> it to 31 elements anyway,

I am not sure this is really a matter of how large the machine is. For
example, in the writeout paths this depends much more on how complex the
IO stack is. Direct memory reclaim is also a very sensitive stack
context. As we are not doing any writeout from there anymore, I believe
a large part of the on-stack fs usage is not really relevant. There seem
to be only a few on-stack users inside mm and they shouldn't be part of
the memory reclaim AFAICS.

I simply did
$ git grep "^[[:space:]]*struct pagevec[[:space:]][^*]"
and fortunately there weren't too many hits, so it was easy to get an
idea about the usage. There is some usage in the graphics stack that
should be double checked though.

Btw. I think that pvec is likely a suboptimal data structure for many
on-stack users. It allows only very few slots to batch. Something like
mmu_gather, which can optimistically increase the batch, sounds like
something that would be worth exploring.

The main question is whether the improvement is visible on any
non-artificial workloads. If yes, then the quick fix is likely the best
way forward. If this is mostly a microbench thingy, then I would be
happier to see a more long-term solution. E.g. scale pcp pagevec sizes
with the machine size, or even use something better than pvec (e.g.
lru_deactivate_file could scale much more, and I am not sure the pcp
aspect is really improving anything - why don't we simply invalidate all
gathered pages at once at the end of invalidate_mapping_pages?).
-- 
Michal Hocko
SUSE Labs
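
For illustration only, here is a minimal userspace sketch of the
"optimistically increase the batch" idea mentioned above. It is not
mmu_gather and not kernel code. All names (grow_batch, INLINE_SLOTS,
MAX_SLOTS, count_flush) are made up for this example, the sizes are
arbitrary, and malloc() merely stands in for whatever opportunistic
allocation a real implementation would use. The point is that the
on-stack part stays small, the batch grows when memory is available, and
a failed allocation simply falls back to flushing early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define INLINE_SLOTS 8     /* small on-stack part (arbitrary value) */
#define MAX_SLOTS    1024  /* cap on optimistic growth (arbitrary value) */

typedef void (*flush_fn)(void **items, size_t n, void *ctx);

/* Hypothetical growable batch; only inline_slots costs stack space. */
struct grow_batch {
	void *inline_slots[INLINE_SLOTS];
	void **slots;   /* points at inline_slots or at a heap array */
	size_t cap;
	size_t nr;
	flush_fn flush;
	void *ctx;
};

/* Example flush callback that only counts what it was handed. */
struct flush_stats { size_t calls; size_t items; };

static void count_flush(void **items, size_t n, void *ctx)
{
	struct flush_stats *s = ctx;

	(void)items;
	s->calls++;
	s->items += n;
}

static void grow_batch_init(struct grow_batch *b, flush_fn flush, void *ctx)
{
	b->slots = b->inline_slots;
	b->cap = INLINE_SLOTS;
	b->nr = 0;
	b->flush = flush;
	b->ctx = ctx;
}

/* Hand all gathered items to the callback in one go. */
static void grow_batch_flush(struct grow_batch *b)
{
	if (b->nr)
		b->flush(b->slots, b->nr, b->ctx);
	b->nr = 0;
}

/* Try to double the capacity; failing is fine, the caller flushes. */
static int grow_batch_try_grow(struct grow_batch *b)
{
	size_t new_cap = b->cap * 2;
	void **p;

	if (new_cap > MAX_SLOTS)
		return 0;
	p = malloc(new_cap * sizeof(*p));
	if (!p)
		return 0;
	for (size_t i = 0; i < b->nr; i++)
		p[i] = b->slots[i];
	if (b->slots != b->inline_slots)
		free(b->slots);
	b->slots = p;
	b->cap = new_cap;
	return 1;
}

static void grow_batch_add(struct grow_batch *b, void *item)
{
	/* Full: grow if we can, otherwise flush to make room. */
	if (b->nr == b->cap && !grow_batch_try_grow(b))
		grow_batch_flush(b);
	assert(b->nr < b->cap);
	b->slots[b->nr++] = item;
}

/* Flush the remainder and drop any heap storage. */
static void grow_batch_finish(struct grow_batch *b)
{
	grow_batch_flush(b);
	if (b->slots != b->inline_slots)
		free(b->slots);
	b->slots = b->inline_slots;
	b->cap = INLINE_SLOTS;
}
```

A caller would do grow_batch_init(), then grow_batch_add() for each
gathered item, then grow_batch_finish(). With the arbitrary sizes above
and every allocation succeeding, 3000 items are flushed in 3 calls,
while a fixed batch of 15 slots would need 200 flushes for the same
input. Whether that matters outside of microbenchmarks is the open
question from the mail.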