Pages to be added to an LRU are first cached in a pagevec before being
added to the LRU in a batch via pagevec_lru_move_fn().  Adding the
pages in a batch with a pagevec mitigates contention on the LRU lock.
Currently the pagevec size is defined to be 15.

We found during testing on a large SMT system (2 sockets, 48 cores,
96 CPU threads) that a pagevec size of 15 is too small for workloads
that cause frequent page additions to the LRU.  With pmbench, 8.9% of
the CPU cycles are spent contending for the LRU lock:

    12.92%  pmbench  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
            |
            --12.92%--0x5555555582f2
                      |
                      --12.92%--page_fault
                                do_page_fault
                                __do_page_fault
                                handle_mm_fault
                                __handle_mm_fault
                                |
                                |--8.90%--__lru_cache_add
                                |          pagevec_lru_move_fn
                                |          |
                                |          --8.90%--_raw_spin_lock_irqsave
                                |                    queued_spin_lock_slowpath

Enlarge the pagevec size to 31 to reduce LRU lock contention on large
systems.  The LRU lock contention is reduced from 8.9% of total CPU
cycles to 2.2%, and pmbench throughput increases from 88.8 Mpages/sec
to 95.1 Mpages/sec.

Signed-off-by: Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx>
---
 include/linux/pagevec.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/pagevec.h b/include/linux/pagevec.h
index 081d934eda64..466ebcdd190d 100644
--- a/include/linux/pagevec.h
+++ b/include/linux/pagevec.h
@@ -11,8 +11,16 @@
 
 #include <linux/xarray.h>
 
+#if CONFIG_NR_CPUS > 64
+/*
+ * Use larger size to reduce lru lock contention on large system.
+ * 31 pointers + header align the pagevec structure to a power of two
+ */
+#define PAGEVEC_SIZE 31
+#else
 /* 15 pointers + header align the pagevec structure to a power of two */
 #define PAGEVEC_SIZE 15
+#endif
 
 struct page;
 struct address_space;

-- 
2.20.1