Hi Matthew,

On 4/30/2023 6:35 AM, Matthew Wilcox wrote:
> On Sat, Apr 29, 2023 at 04:27:59PM +0800, Yin Fengwei wrote:
>> @@ -22,6 +23,7 @@ struct address_space;
>>  struct pagevec {
>>  	unsigned char nr;
>>  	bool percpu_pvec_drained;
>> +	unsigned short nr_pages;
>
> I still don't like storing nr_pages in the pagevec/folio_batch.

What about a change like the following:

diff --git a/mm/swap.c b/mm/swap.c
index 57cb01b042f6..5e7e9c0734ab 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -228,8 +228,10 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 static void folio_batch_add_and_move(struct folio_batch *fbatch,
 		struct folio *folio, move_fn_t move_fn)
 {
-	if (folio_batch_add(fbatch, folio) && !folio_test_large(folio) &&
-	    !lru_cache_disabled())
+	int nr_pages = folio_nr_pages(folio);
+
+	if (folio_batch_add(fbatch, folio) && !lru_cache_disabled() &&
+	    (!folio_test_large(folio) || (nr_pages <= (PAGEVEC_SIZE + 1))))
 		return;
 	folio_batch_move_lru(fbatch, move_fn);
 }

I tested the lru lock contention with different folio sizes using will-it-scale, with the deferred queue lock contention mitigated:
  - If the large folio size is 16K (order 2), the lru lock takes 64.31% of cpu runtime.
  - If the large folio size is 64K (order 4), the lru lock takes 24.24% of cpu runtime.

This matches our expectation: the larger the folio size, the less lru lock contention. So it is acceptable not to batch the lru operations for folios that are large enough.

PAGEVEC_SIZE + 1 was chosen here for the following reasons:
  - It gives an acceptable max memory size per batch: 15 x 16 x 4096 = 983040 bytes.
  - Folios larger than that do not get the batched operation, but their lru lock contention is already low.
I collected lru contention data when running will-it-scale.page_fault1:

folio with order 2:

  Without the change:
  -   64.31%     0.23%  page_fault1_pro  [kernel.kallsyms]  [k] folio_lruvec_lock_irqsave
     + 64.07% folio_lruvec_lock_irqsave

  With the change:
  -   21.55%     0.21%  page_fault1_pro  [kernel.kallsyms]  [k] folio_lruvec_lock_irqsave
     + 21.34% folio_lruvec_lock_irqsave

folio with order 4:

  Without the change:
  -   24.24%     0.15%  page_fault1_pro  [kernel.kallsyms]  [k] folio_lruvec_lock_irqsave
     + 24.09% folio_lruvec_lock_irqsave

  With the change:
  -    2.20%     0.09%  page_fault1_pro  [kernel.kallsyms]  [k] folio_lruvec_lock_irqsave
     + 2.11% folio_lruvec_lock_irqsave

folio with order 5:

  -    8.21%     0.16%  page_fault1_pro  [kernel.kallsyms]  [k] folio_lruvec_lock_irqsave
     + 8.05% folio_lruvec_lock_irqsave

Regards,
Yin, Fengwei