On 26 Jan 2024, at 4:03, Baolin Wang wrote: > On 1/23/2024 11:46 AM, Zi Yan wrote: >> From: Zi Yan <ziy@xxxxxxxxxx> >> >> Hi all, >> >> This patchset enables >0 order folio memory compaction, which is one of >> the prerequisitions for large folio support[1]. It is on top of >> mm-everything-2024-01-18-22-21. >> >> I am aware of that split free pages is necessary for folio >> migration in compaction, since if >0 order free pages are never split >> and no order-0 free page is scanned, compaction will end prematurely due >> to migration returns -ENOMEM. Free page split becomes a must instead of >> an optimization. >> >> Some applications from vm-scalability show different performance trends >> on default LRU and CONFIG_LRU_GEN from patch 1 (split folio during compaction), >> to patch 2 (folio migration during compaction), to patch 3 (folio >> migration during compaction with free page split). I am looking into it. >> >> lkp ncompare results (with >5% delta) for default LRU and CONFIG_LRU_GEN are >> shown at the bottom (on a 8-CPU (Intel Xeon E5-2650 v4 @ 2.20GHz) 16G VM). > > Overall, I haven't found any obvious issues, thanks for your work. However I got some percentage regression when running thpcompact on my machine(16 cores and 120G memory) without enabling mTHP: > k6.8-rc1 k6.8-rc1-patched > Percentage huge-1 86.19 ( 0.00%) 51.17 ( -40.63%) > Percentage huge-3 93.64 ( 0.00%) 42.48 ( -54.64%) > Percentage huge-5 94.93 ( 0.00%) 31.06 ( -67.28%) > Percentage huge-7 95.40 ( 0.00%) 19.09 ( -79.99%) > Percentage huge-12 93.51 ( 0.00%) 32.06 ( -65.71%) > Percentage huge-18 83.02 ( 0.00%) 54.58 ( -34.26%) > Percentage huge-24 83.17 ( 0.00%) 49.61 ( -40.35%) > Percentage huge-30 96.69 ( 0.00%) 59.82 ( -38.13%) > Percentage huge-32 95.52 ( 0.00%) 59.20 ( -38.03%) > > Ops Compaction stalls 229710.00 554846.00 > Ops Compaction success 144177.00 9351.00 > Ops Compaction failures 85533.00 545495.00 > Ops Compaction efficiency 62.76 1.69 > Ops Page migrate success 60333689.00 11687573.00 > Ops Page migrate failure 25818.00 459621.00 > Ops Compaction pages isolated 127723211.00 224420997.00 > Ops Compaction migrate scanned 142498744.00 173345194.00 > Ops Compaction free scanned 1159752360.00 624633726.00 > Ops Compact scan efficiency 12.29 27.75 > Ops Compaction cost 66050.96 17615.55 > > I did not have time to analyze this issue, just providing you some test information. And I will measure the compaction efficiency of mTHP if I find some time. Thanks a lot. These are useful numbers. I can see that the number of migration failures doubled and only ~1/5 of successes. I will use thpcompact to look into the issues. > >> From V1 [2]: >> 1. Used folio_test_large() instead of folio_order() > 0. (per Matthew >> Wilcox) >> >> 2. Fixed code rebase error. (per Baolin Wang) >> >> 3. Used list_split_init() instead of list_split(). (per Ryan Boberts) >> >> 4. Added free_pages_prepare_fpi_none() to avoid duplicate free page code >> in compaction_free(). >> >> 5. Dropped source page order sorting patch. >> >> From RFC [1]: >> 1. Enabled >0 order folio compaction in the first patch by splitting all >> to-be-migrated folios. (per Huang, Ying) >> >> 2. Stopped isolating compound pages with order greater than cc->order >> to avoid wasting effort, since cc->order gives a hint that no free pages >> with order greater than it exist, thus migrating the compound pages will fail. >> (per Baolin Wang) >> >> 3. Retained the folio check within lru lock. (per Baolin Wang) >> >> 4. Made isolate_freepages_block() generate order-sorted multi lists. >> (per Johannes Weiner) >> >> Overview >> === >> >> To support >0 order folio compaction, the patchset changes how free pages used >> for migration are kept during compaction. Free pages used to be split into >> order-0 pages that are post allocation processed (i.e., PageBuddy flag cleared, >> page order stored in page->private is zeroed, and page reference is set to 1). >> Now all free pages are kept in a MAX_ORDER+1 array of page lists based >> on their order without post allocation process. When migrate_pages() asks for >> a new page, one of the free pages, based on the requested page order, is >> then processed and given out. >> >> >> Feel free to give comments and ask questions. >> >> Thanks. >> >> [1] https://lore.kernel.org/linux-mm/20230912162815.440749-1-zi.yan@xxxxxxxx/ >> [2] https://lore.kernel.org/linux-mm/20231113170157.280181-1-zi.yan@xxxxxxxx/ >> >> vm-scalability results on CONFIG_LRU_GEN >> === >> >> ========================================================================================= >> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: >> gcc-13/defconfig/debian/300s/qemu-vm/small-allocs/vm-scalability >> >> commit: >> 6.7.0-rc4+ >> 6.7.0-rc4-split-folio-in-compaction+ >> 6.7.0-rc4-folio-migration-in-compaction+ >> 6.7.0-rc4-folio-migration-free-page-split+ >> >> 6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f >> ---------------- --------------------------- --------------------------- --------------------------- >> %stddev %change %stddev %change %stddev %change %stddev >> \ | \ | \ | \ >> 2024326 +35.5% 2743772 ± 41% +364.0% 9392198 ± 35% +31.0% 2651634 vm-scalability.throughput >> >> ========================================================================================= >> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: >> gcc-13/defconfig/debian/300s/qemu-vm/small-allocs-mt/vm-scalability >> >> commit: >> 6.7.0-rc4+ >> 6.7.0-rc4-split-folio-in-compaction+ >> 6.7.0-rc4-folio-migration-in-compaction+ >> 6.7.0-rc4-folio-migration-free-page-split+ >> >> 6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f >> ---------------- --------------------------- --------------------------- --------------------------- >> %stddev %change %stddev %change %stddev %change %stddev >> \ | \ | \ | \ >> 1450189 +0.9% 1463418 +30.4% 1891610 ± 22% +0.3% 1454100 vm-scalability.throughput >> >> ========================================================================================= >> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: >> gcc-13/defconfig/debian/300s/qemu-vm/mmap-xread-seq-mt/vm-scalability >> >> commit: >> 6.7.0-rc4+ >> 6.7.0-rc4-split-folio-in-compaction+ >> 6.7.0-rc4-folio-migration-in-compaction+ >> 6.7.0-rc4-folio-migration-free-page-split+ >> >> 6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f >> ---------------- --------------------------- --------------------------- --------------------------- >> %stddev %change %stddev %change %stddev %change %stddev >> \ | \ | \ | \ >> 14428848 ± 27% -51.7% 6963308 ± 73% +13.5% 16372621 +11.2% 16046511 vm-scalability.throughput >> >> ========================================================================================= >> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: >> gcc-13/defconfig/debian/300s/qemu-vm/mmap-pread-seq/vm-scalability >> >> commit: >> 6.7.0-rc4+ >> 6.7.0-rc4-split-folio-in-compaction+ >> 6.7.0-rc4-folio-migration-in-compaction+ >> 6.7.0-rc4-folio-migration-free-page-split+ >> >> 6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f >> ---------------- --------------------------- --------------------------- --------------------------- >> %stddev %change %stddev %change %stddev %change %stddev >> \ | \ | \ | \ >> 13569502 ± 24% -45.9% 7340064 ± 59% +12.3% 15240531 +10.4% 14983705 vm-scalability.throughput >> >> ========================================================================================= >> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: >> gcc-13/defconfig/debian/300s/qemu-vm/mmap-pread-seq-mt/vm-scalability >> >> commit: >> 6.7.0-rc4+ >> 6.7.0-rc4-split-folio-in-compaction+ >> 6.7.0-rc4-folio-migration-in-compaction+ >> 6.7.0-rc4-folio-migration-free-page-split+ >> >> 6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f >> ---------------- --------------------------- --------------------------- --------------------------- >> %stddev %change %stddev %change %stddev %change %stddev >> \ | \ | \ | \ >> 13305823 ± 24% -45.1% 7299664 ± 56% +12.5% 14974725 +10.4% 14695963 vm-scalability.throughput >> >> ========================================================================================= >> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: >> gcc-13/defconfig/debian/300s/qemu-vm/lru-file-readtwice/vm-scalability >> >> commit: >> 6.7.0-rc4+ >> 6.7.0-rc4-split-folio-in-compaction+ >> 6.7.0-rc4-folio-migration-in-compaction+ >> 6.7.0-rc4-folio-migration-free-page-split+ >> >> 6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f >> ---------------- --------------------------- --------------------------- --------------------------- >> %stddev %change %stddev %change %stddev %change %stddev >> \ | \ | \ | \ >> 13244376 ± 28% +54.2% 20425838 ± 23% -4.4% 12660113 ± 3% -9.0% 12045809 ± 3% vm-scalability.throughput >> >> ========================================================================================= >> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: >> gcc-13/defconfig/debian/300s/qemu-vm/lru-file-mmap-read/vm-scalability >> >> commit: >> 6.7.0-rc4+ >> 6.7.0-rc4-split-folio-in-compaction+ >> 6.7.0-rc4-folio-migration-in-compaction+ >> 6.7.0-rc4-folio-migration-free-page-split+ >> >> 6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f >> ---------------- --------------------------- --------------------------- --------------------------- >> %stddev %change %stddev %change %stddev %change %stddev >> \ | \ | \ | \ >> 7021425 ± 11% -20.9% 5556751 ± 19% +14.8% 8057811 ± 3% +9.4% 7678613 ± 4% vm-scalability.throughput >> >> ========================================================================================= >> compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: >> gcc-13/defconfig/debian/300s/256G/qemu-vm/msync/vm-scalability >> >> commit: >> 6.7.0-rc4+ >> 6.7.0-rc4-split-folio-in-compaction+ >> 6.7.0-rc4-folio-migration-in-compaction+ >> 6.7.0-rc4-folio-migration-free-page-split+ >> >> 6.7.0-rc4+ 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f >> ---------------- --------------------------- --------------------------- --------------------------- >> %stddev %change %stddev %change %stddev %change %stddev >> \ | \ | \ | \ >> 1208994 ±137% +263.5% 4394683 ± 49% -49.4% 611204 ± 6% -48.1% 627937 ± 13% vm-scalability.throughput >> >> >> >> vm-scalability results on default LRU (with -no-mglru suffix) >> === >> >> ========================================================================================= >> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: >> gcc-13/defconfig/debian/300s/qemu-vm/lru-file-readtwice/vm-scalability >> >> commit: >> 6.7.0-rc4-no-mglru+ >> 6.7.0-rc4-split-folio-in-compaction-no-mglru+ >> 6.7.0-rc4-folio-migration-in-compaction-no-mglru+ >> 6.7.0-rc4-folio-migration-free-page-split-no-mglru+ >> >> 6.7.0-rc4-no-mgl 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f >> ---------------- --------------------------- --------------------------- --------------------------- >> %stddev %change %stddev %change %stddev %change %stddev >> \ | \ | \ | \ >> 8412072 ± 3% +32.1% 11114537 ± 41% +3.5% 8703491 ± 3% +1.5% 8536343 ± 3% vm-scalability.throughput >> >> ========================================================================================= >> compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: >> gcc-13/defconfig/debian/300s/qemu-vm/lru-file-mmap-read/vm-scalability >> >> commit: >> 6.7.0-rc4-no-mglru+ >> 6.7.0-rc4-split-folio-in-compaction-no-mglru+ >> 6.7.0-rc4-folio-migration-in-compaction-no-mglru+ >> 6.7.0-rc4-folio-migration-free-page-split-no-mglru+ >> >> 6.7.0-rc4-no-mgl 6.7.0-rc4-split-folio-in-co 6.7.0-rc4-folio-migration-i 6.7.0-rc4-folio-migration-f >> ---------------- --------------------------- --------------------------- --------------------------- >> %stddev %change %stddev %change %stddev %change %stddev >> \ | \ | \ | \ >> 7095358 +10.8% 7863635 ± 16% +5.5% 7484110 +1.5% 7200666 ± 4% vm-scalability.throughput >> >> >> Zi Yan (3): >> mm/compaction: enable compacting >0 order folios. >> mm/compaction: add support for >0 order folio memory compaction. >> mm/compaction: optimize >0 order folio compaction with free page >> split. >> >> mm/compaction.c | 218 ++++++++++++++++++++++++++++++++++-------------- >> mm/internal.h | 9 +- >> mm/page_alloc.c | 6 ++ >> 3 files changed, 169 insertions(+), 64 deletions(-) >> -- Best Regards, Yan, Zi
Attachment:
signature.asc
Description: OpenPGP digital signature