On 6/26/20 8:47 PM, Andrew Morton wrote:
> On Sat, 27 Jun 2020 04:13:04 +0100 Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
>> On Fri, Jun 26, 2020 at 02:23:03PM -0700, Tim Chen wrote:
>>> Enlarge the pagevec size to 31 to reduce LRU lock contention for
>>> large systems.
>>>
>>> The LRU lock contention is reduced from 8.9% of total CPU cycles
>>> to 2.2% of CPU cycles. And the pmbench throughput increases
>>> from 88.8 Mpages/sec to 95.1 Mpages/sec.
>>
>> The downside here is that pagevecs are often stored on the stack (eg
>> truncate_inode_pages_range()) as well as being used for the LRU list.
>> On a 64-bit system, this increases the stack usage from 128 to 256 bytes
>> for this array.
>>
>> I wonder if we could do something where we transform the ones on the
>> stack to DECLARE_STACK_PAGEVEC(pvec), and similarly DECLARE_LRU_PAGEVEC
>> the ones used for the LRUs. There's plenty of space in the header to
>> add an unsigned char sz, delete PAGEVEC_SIZE and make it a variable
>> length struct.
>>
>> Or maybe our stacks are now big enough that we just don't care.
>> What do you think?
>
> And I wonder how useful CONFIG_NR_CPUS is for making this decision.
> Presumably a lot of general-purpose kernel builds have CONFIG_NR_CPUS
> much larger than the actual number of CPUs.
>
> I can't think of much of a fix for this, apart from making it larger on
> all kernels. Is there a downside to this?
>

Thanks to Matthew and Andrew for the feedback.

I am okay with Matthew's suggestion of keeping the stack pagevec size
unchanged. Andrew, do you have a preference?

I was assuming that people who really care about saving kernel memory
would make CONFIG_NR_CPUS small. I also have a hard time coming up with
a better scheme.

Otherwise, we would have to adjust the pagevec size at run time, once we
find out how many CPUs have actually been brought online. That seems
like a lot of added complexity.

Tim
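
For reference, the 128 vs. 256 byte stack figures quoted above can be
checked with a small user-space sketch. It is only an illustration: the
struct layout is modeled on the pagevec of that era (a two-byte header
followed by an array of page pointers), and the names pagevec_15,
pagevec_31 and pagevec_bytes() are hypothetical, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page;	/* opaque here; only pointers to it are stored */

/* Assumed layout: small header, then the page pointer array. */
struct pagevec_15 {
	unsigned char nr;
	bool percpu_pvec_drained;
	struct page *pages[15];		/* current size */
};

struct pagevec_31 {
	unsigned char nr;
	bool percpu_pvec_drained;
	struct page *pages[31];		/* proposed size */
};

/*
 * Bytes needed for a pagevec with nr_slots entries: the header is
 * padded up to pointer alignment, then one pointer per slot.
 * On an LP64 system this gives 8 + 15 * 8 = 128 for 15 slots and
 * 8 + 31 * 8 = 256 for 31 slots.
 */
size_t pagevec_bytes(size_t nr_slots)
{
	assert(nr_slots > 0);
	return offsetof(struct pagevec_15, pages) +
	       nr_slots * sizeof(struct page *);
}
```

Every on-stack pagevec, such as the one in truncate_inode_pages_range(),
would pay that doubled cost unless the stack and LRU users get separate
sizes.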