Re: [PATCH rfc 0/3] mm: allow more high-order pages stored on PCP lists


 





On 2024/4/15 16:18, Barry Song wrote:
On Mon, Apr 15, 2024 at 8:12 PM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:

Both file pages and anonymous pages support large folios, so high-order
pages other than PMD_ORDER are also allocated frequently, which can
increase zone lock contention. Allowing high-order pages on the PCP lists
can reduce that contention, but as commit 44042b449872
("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists")
pointed out, it may not win in every scenario. Add a new sysfs control to
enable or disable storing pages of a specified high order on the PCP lists;
orders in the range (PAGE_ALLOC_COSTLY_ORDER, PMD_ORDER) are not stored on
the PCP lists by default.
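
To make the range above concrete, here is a minimal sketch that lists the
orders strictly between PAGE_ALLOC_COSTLY_ORDER and PMD_ORDER with their
block sizes. The values are assumptions for a 4 KiB base page configuration
(PAGE_ALLOC_COSTLY_ORDER = 3, PMD_ORDER = 9), and the helper names are
hypothetical, not taken from the patches.

```python
# Assumed values, not taken from the patches: 4 KiB base pages,
# PAGE_ALLOC_COSTLY_ORDER = 3, PMD_ORDER = 9 (2048 KiB PMD-sized pages).
PAGE_SIZE_KIB = 4
PAGE_ALLOC_COSTLY_ORDER = 3
PMD_ORDER = 9


def controllable_orders(costly_order=PAGE_ALLOC_COSTLY_ORDER, pmd_order=PMD_ORDER):
    """Return the orders strictly between costly_order and pmd_order."""
    return list(range(costly_order + 1, pmd_order))


def order_size_kib(order, page_size_kib=PAGE_SIZE_KIB):
    """Return the size in KiB of one block of the given order."""
    return page_size_kib << order


if __name__ == "__main__":
    for order in controllable_orders():
        print(f"order {order}: {order_size_kib(order)} KiB")
```

Under these assumptions the range covers orders 4 through 8, and the
hugepage-64K size used in the benchmark below corresponds to order 4.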

This is precisely something Baolin and I have discussed and intended
to implement[1], but unfortunately, we haven't had the time to do so.

Indeed, the same thing. Recently we have been working on unixbench/lmbench
optimization. I tested multi-size THP for anonymous memory by hard-coding
PAGE_ALLOC_COSTLY_ORDER from 3 to 4[1]; it showed some improvement, but
not in all cases and not very stably, so I re-implemented it so that it
follows the user's requirement and can be enabled dynamically.

[1] https://lore.kernel.org/linux-mm/b8f5a47a-af1e-44ed-a89b-460d0be56d2c@xxxxxxxxxx/


[1] https://lore.kernel.org/linux-mm/13c59ca8-baac-405e-8640-e693c78ef79a@xxxxxxx/T/#mecb0514ced830ac4df320113bedd7073bea9ab7a


With the perf lock tools, the lock contention from will-it-scale page_fault1
(90 tasks run for 10s, hugepage-2048KB never, hugepage-64K always) is shown
below (only the zone spinlock and pcp spinlock are of interest):

Without patches,
  contended   total wait     max wait     avg wait         type   caller
        713      4.64 ms     74.37 us      6.51 us     spinlock   __alloc_pages+0x23c

With patches,
  contended   total wait     max wait     avg wait         type   caller
          2     25.66 us     16.31 us     12.83 us     spinlock   rmqueue_pcplist+0x2b0

Similar results on shell8 from unixbench,

Without patches,
  contended   total wait     max wait     avg wait         type   caller
       4942    901.09 ms      1.31 ms    182.33 us     spinlock   __alloc_pages+0x23c
       1556    298.76 ms      1.23 ms    192.01 us     spinlock   rmqueue_pcplist+0x2b0
        991    182.73 ms    879.80 us    184.39 us     spinlock   rmqueue_pcplist+0x2b0

With patches,
  contended   total wait     max wait     avg wait         type   caller
        988    187.63 ms    855.18 us    189.91 us     spinlock   rmqueue_pcplist+0x2b0
        505     88.99 ms    793.27 us    176.21 us     spinlock   rmqueue_pcplist+0x2b0

The benchmark score shows only a small improvement (0.28%) on shell8, but
the zone lock contention from __alloc_pages() disappeared.
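
The shell8 rows above can be totalled with a short script. This is only a
sketch: the row layout (count, three value/unit pairs, lock type, caller) is
assumed from the tables in this message, and parse_rows() and summarize()
are hypothetical helper names.

```python
# Rows copied from the shell8 tables in this message.
WITHOUT_PATCHES = """\
       4942    901.09 ms      1.31 ms    182.33 us     spinlock   __alloc_pages+0x23c
       1556    298.76 ms      1.23 ms    192.01 us     spinlock   rmqueue_pcplist+0x2b0
        991    182.73 ms    879.80 us    184.39 us     spinlock   rmqueue_pcplist+0x2b0
"""

WITH_PATCHES = """\
        988    187.63 ms    855.18 us    189.91 us     spinlock   rmqueue_pcplist+0x2b0
        505     88.99 ms    793.27 us    176.21 us     spinlock   rmqueue_pcplist+0x2b0
"""

# Conversion factors to microseconds for the units seen in the tables.
UNIT_TO_US = {"ns": 1e-3, "us": 1.0, "ms": 1e3, "s": 1e6}


def parse_rows(text):
    """Parse table rows into dicts; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Expected: contended, total+unit, max+unit, avg+unit, type, caller
        if len(fields) != 9 or not fields[0].isdigit():
            continue
        rows.append({
            "contended": int(fields[0]),
            "total_wait_us": float(fields[1]) * UNIT_TO_US[fields[2]],
            "type": fields[7],
            "caller": fields[8],
        })
    return rows


def summarize(rows):
    """Return (total contended count, total wait in microseconds)."""
    contended = sum(r["contended"] for r in rows)
    total_wait_us = sum(r["total_wait_us"] for r in rows)
    return contended, total_wait_us


if __name__ == "__main__":
    for label, text in (("without", WITHOUT_PATCHES), ("with", WITH_PATCHES)):
        contended, wait_us = summarize(parse_rows(text))
        print(f"{label} patches: {contended} contended, {wait_us / 1e3:.2f} ms total wait")
```

Summed over the listed callers, shell8 goes from 7489 contended acquisitions
(1382.58 ms total wait) without the patches to 1493 (276.62 ms) with them.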

Kefeng Wang (3):
   mm: prepare more high-order pages to be stored on the per-cpu lists
   mm: add control to allow specified high-order pages stored on PCP list
   mm: pcp: show each order page count

  Documentation/admin-guide/mm/transhuge.rst | 11 ++++
  include/linux/gfp.h                        |  1 +
  include/linux/huge_mm.h                    |  1 +
  include/linux/mmzone.h                     | 10 ++-
  include/linux/vmstat.h                     | 19 ++++++
  mm/Kconfig.debug                           |  8 +++
  mm/huge_memory.c                           | 74 ++++++++++++++++++++++
  mm/page_alloc.c                            | 30 +++++++--
  mm/vmstat.c                                | 16 +++++
  9 files changed, 164 insertions(+), 6 deletions(-)

--
2.27.0






