Ryan tried to enable large folios for anonymous mappings [1].

Unlike large folios for the page cache, which do not trigger frequent
page allocation/free, large folios for anonymous mappings are allocated
and freed much more frequently. So large folios for anonymous mappings
expose some lock contention.

Ryan mentioned the deferred queue lock in [1]. We also hit two other
contended locks: the lru lock and the zone lock.

This series tries to mitigate the deferred queue lock contention and to
reduce the lru lock contention to some extent.

Patch 1 reduces deferred queue lock contention by not acquiring the
queue lock when checking whether the folio is on the deferred list. The
page_fault1 test of will-it-scale showed a 60% reduction in deferred
queue lock contention.

Patch 2 reduces lru lock contention by allowing large folios to be added
to the lru list in batches. The page_fault1 test of will-it-scale showed
a 20% reduction in lru lock contention.

The zone lock contention happens on the large folio free path. It is
related to commit f26b3fa04611 ("mm/page_alloc: limit number of
high-order pages on PCP during bulk free") and will not be addressed by
this series.

[1]
https://lore.kernel.org/linux-mm/20230414130303.2345383-1-ryan.roberts@xxxxxxx/

Changelog from v2:
  - Rebased to v6.3-rc7.
  - Removed Tested-by: Ryan Roberts <ryan.roberts@xxxxxxx>, as the
    patches were updated after Ryan tested them.
  - Updated the perf data for deferred queue lock and lru lock
    contention with v3.
  - Recheck whether the folio is on deferred_list after taking the
    deferred queue lock, as Kirill suggested.

Changelog from v1:
  For patch2:
    - Add Reported-by from Huang Ying, which I missed by mistake.
    - Fix kernel panic issue. folio_batch_add() can be passed a folio
      parameter that cannot be dereferenced as a folio directly:
        - For mlock usage, add a new interface with an extra nr_pages
          parameter. The caller passes nr_pages by referencing the
          folio directly.
        - For swap, shadow and dax entries passed as the folio
          parameter, treat nr_pages as 1.
      With the fix, stress testing ran for 12 hours without any issue,
      whereas it hit a kernel panic in around 3 minutes before the fix.
    - Update the lock contention info in the commit message.
    - Change the field name from pages_nr to nr_pages, as Ying
      suggested.

  For this version, still use PAGEVEC_SIZE as the max nr_pages in
  fbatch. We can revise it after we decide on the page order for
  anonymous large folios.

Yin Fengwei (2):
  THP: avoid lock when check whether THP is in deferred list
  lru: allow large batched add large folio to lru list

 include/linux/pagevec.h | 46 ++++++++++++++++++++++++++++++++++++++---
 mm/huge_memory.c        | 17 ++++++++++-----
 mm/mlock.c              |  7 +++----
 mm/swap.c               |  3 +--
 4 files changed, 59 insertions(+), 14 deletions(-)

-- 
2.34.1
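
For readers who want the shape of the patch 1 idea without opening the
diff, here is a rough user-space sketch. It is not the kernel code:
struct deferred_queue, struct item, deferred_queue_remove() and the
remove_lock_acquisitions counter are all made-up names, and a pthread
mutex stands in for the deferred queue spinlock. It only illustrates
"peek at the list linkage without the lock, then recheck after taking
the lock", which is the pattern described above for patch 1 and in the
v2 changelog.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, modeled loosely on list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *n) { n->prev = n->next = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-in for the deferred split queue. */
struct deferred_queue {
	pthread_mutex_t lock;
	struct list_node head;
	unsigned long len;
	unsigned long remove_lock_acquisitions;	/* instrumentation only */
};

/* Hypothetical stand-in for a folio with its deferred list linkage. */
struct item {
	struct list_node deferred;
};

static void deferred_queue_init(struct deferred_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	list_init(&q->head);
	q->len = 0;
	q->remove_lock_acquisitions = 0;
}

static void deferred_queue_add(struct deferred_queue *q, struct item *it)
{
	pthread_mutex_lock(&q->lock);
	if (list_is_empty(&it->deferred)) {
		list_add_tail(&it->deferred, &q->head);
		q->len++;
	}
	pthread_mutex_unlock(&q->lock);
}

/* Returns true if the item was queued and has been removed. */
static bool deferred_queue_remove(struct deferred_queue *q, struct item *it)
{
	bool removed = false;

	/*
	 * Lockless peek: an item that is not queued never touches the lock.
	 * Assumption: this sketch is exercised single-threaded. A concurrent
	 * user-space version would need an atomic load here.
	 */
	if (list_is_empty(&it->deferred))
		return false;

	pthread_mutex_lock(&q->lock);
	q->remove_lock_acquisitions++;
	/* Recheck under the lock: the state may have changed since the peek. */
	if (!list_is_empty(&it->deferred)) {
		list_del_init(&it->deferred);
		assert(q->len > 0);
		q->len--;
		removed = true;
	}
	pthread_mutex_unlock(&q->lock);

	return removed;
}
```

In this sketch, removing an item that was never queued returns early
and leaves the lock untouched, so only items that look queued pay for
the lock. The recheck under the lock is what keeps the early peek safe
to act on.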