On Mon, Apr 15, 2024 at 8:12 PM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:
>
> Both the file pages and anonymous pages support large folio, high-order
> pages except PMD_ORDER will also be allocated frequently which could
> increase the zone lock contention, allow high-order pages on pcp lists
> could reduce the big zone lock contention, but as commit 44042b449872
> ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
> pointed, it may not win in all the scenes, add a new control sysfs to
> enable or disable specified high-order pages stored on PCP lists, the order
> (PAGE_ALLOC_COSTLY_ORDER, PMD_ORDER) won't be stored on PCP list by default.

This is precisely something Baolin and I have discussed and intended to
implement[1], but unfortunately, we haven't had the time to do so.

[1] https://lore.kernel.org/linux-mm/13c59ca8-baac-405e-8640-e693c78ef79a@xxxxxxx/T/#mecb0514ced830ac4df320113bedd7073bea9ab7a

>
> With perf lock tools, the lock contention from will-it-scale page_fault1
> (with 90 tasks run 10s, hugepage-2048KB never, hugepage-64K always) show
> below(only care about zone spinlock and pcp spinlock),
>
> Without patches,
>  contended   total wait     max wait     avg wait       type   caller
>        713      4.64 ms     74.37 us      6.51 us   spinlock   __alloc_pages+0x23c
>
> With patches,
>  contended   total wait     max wait     avg wait       type   caller
>          2     25.66 us     16.31 us     12.83 us   spinlock   rmqueue_pcplist+0x2b0
>
> Similar results on shell8 from unixbench,
>
> Without patches,
>       4942    901.09 ms      1.31 ms    182.33 us   spinlock   __alloc_pages+0x23c
>       1556    298.76 ms      1.23 ms    192.01 us   spinlock   rmqueue_pcplist+0x2b0
>        991    182.73 ms    879.80 us    184.39 us   spinlock   rmqueue_pcplist+0x2b0
>
> With patches,
>  contended   total wait     max wait     avg wait       type   caller
>        988    187.63 ms    855.18 us    189.91 us   spinlock   rmqueue_pcplist+0x2b0
>        505     88.99 ms    793.27 us    176.21 us   spinlock   rmqueue_pcplist+0x2b0
>
> The Benchmarks Score shows a little improvement (0.28%) from shell8, but the
> zone lock from __alloc_pages() disappeared.
>
> Kefeng Wang (3):
>   mm: prepare more high-order pages to be stored on the per-cpu lists
>   mm: add control to allow specified high-order pages stored on PCP list
>   mm: pcp: show each order page count
>
>  Documentation/admin-guide/mm/transhuge.rst | 11 ++++
>  include/linux/gfp.h                        |  1 +
>  include/linux/huge_mm.h                    |  1 +
>  include/linux/mmzone.h                     | 10 ++-
>  include/linux/vmstat.h                     | 19 ++++++
>  mm/Kconfig.debug                           |  8 +++
>  mm/huge_memory.c                           | 74 ++++++++++++++++++++++
>  mm/page_alloc.c                            | 30 +++++++--
>  mm/vmstat.c                                | 16 +++++
>  9 files changed, 164 insertions(+), 6 deletions(-)
>
> --
> 2.27.0
>
>
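
To make the default policy in the cover letter concrete, here is a rough
userspace sketch of the per-order decision as I read it. It is not code from
the series. The function name pcp_order_allowed() and the enabled_mask
parameter are hypothetical, and the values 3 and 9 assume x86-64 with 4 KiB
base pages. The sketch also assumes that orders up to PAGE_ALLOC_COSTLY_ORDER
and PMD_ORDER itself stay on the PCP lists unconditionally, with only the
orders strictly in between gated by the new control.

```c
#include <stdbool.h>

/* Assumed values for x86-64 with 4 KiB base pages; other configs differ. */
#define PAGE_ALLOC_COSTLY_ORDER 3
#define PMD_ORDER               9

/*
 * Hypothetical helper: decide whether pages of a given order may be kept
 * on the per-cpu lists. Bit N of enabled_mask stands for "order N has been
 * enabled by the administrator"; a mask of 0 models the default.
 */
static bool pcp_order_allowed(unsigned int order, unsigned long enabled_mask)
{
    if (order <= PAGE_ALLOC_COSTLY_ORDER)
        return true;    /* low orders: assumed always cached on PCP */
    if (order == PMD_ORDER)
        return true;    /* PMD-sized pages: assumed always cached on PCP */
    if (order > PMD_ORDER)
        return false;   /* above PMD_ORDER: never cached on PCP */

    /* Orders in (PAGE_ALLOC_COSTLY_ORDER, PMD_ORDER): off unless enabled. */
    return (enabled_mask >> order) & 1UL;
}
```

With a mask of 0, order 4 is rejected; with the mask set to 1UL << 4, order 4
is accepted while order 5 is still rejected.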