From: Barry Song <v-songbaohua@xxxxxxxx>

When large folios are compressed at a larger granularity, we observe a
notable reduction in CPU usage and a significant improvement in
compression ratios.

mTHP's ability to be swapped out without splitting and swapped back in
as a whole allows compression and decompression at larger granularities.

This patchset enhances zsmalloc and zram by adding support for dividing
large folios into multi-page blocks, typically configured with an
order-2 granularity. Without this patchset, a large folio is always
divided into `nr_pages` 4KiB blocks.

The granularity can be set using the `ZSMALLOC_MULTI_PAGES_ORDER`
setting, where the default of 2 allows all anonymous THP to benefit.

Examples include:
* A 16KiB large folio will be compressed and stored as a single 16KiB
  block.
* A 64KiB large folio will be compressed and stored as four 16KiB
  blocks.

For example, swapping out and swapping in 100MiB of typical anonymous
data 100 times (with 16KiB mTHP enabled) using zstd yields the following
results:

                        w/o patches        w/ patches
swap-out time(ms)       68711              49908
swap-in time(ms)        30687              20685
compression ratio       20.49%             16.9%

-v2:
 While it is not mature yet, I know some people are waiting for
 an update :-)
 * Fixed some stability issues.
 * Rebased against the latest mm-unstable.
 * Set the default order to 2, which benefits all anon mTHP.
 * Multi-page ZsPageMovable is not supported yet.

Tangquan Zheng (2):
  mm: zsmalloc: support objects compressed based on multiple pages
  zram: support compression at the granularity of multi-pages

 drivers/block/zram/Kconfig    |   9 +
 drivers/block/zram/zcomp.c    |  17 +-
 drivers/block/zram/zcomp.h    |  12 +-
 drivers/block/zram/zram_drv.c | 450 +++++++++++++++++++++++++++++++---
 drivers/block/zram/zram_drv.h |  45 ++++
 include/linux/zsmalloc.h      |  10 +-
 mm/Kconfig                    |  18 ++
 mm/zsmalloc.c                 | 232 +++++++++++++-----
 8 files changed, 699 insertions(+), 94 deletions(-)

-- 
2.39.3 (Apple Git-146)
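
The block division described in the cover letter can be illustrated with a small user-space sketch. This is not code from the patchset: the `SKETCH_*` constants, `struct block_layout`, and `sketch_block_layout()` are hypothetical names, 4KiB base pages are assumed, and the per-page fallback for folios smaller than the multi-page order is an assumption rather than something the cover letter states.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; the real ones come from the kernel config. */
#define SKETCH_PAGE_SHIFT        12u /* 4KiB base pages */
#define SKETCH_MULTI_PAGES_ORDER 2u  /* mirrors the default ZSMALLOC_MULTI_PAGES_ORDER of 2 */

struct block_layout {
	size_t nr_blocks;  /* number of compression units */
	size_t block_size; /* bytes per compression unit */
};

/* Hypothetical helper: how a folio of the given order is cut into blocks. */
static struct block_layout sketch_block_layout(unsigned int folio_order)
{
	struct block_layout l;

	assert(folio_order < 20); /* keep the shifts well in range for this sketch */

	if (folio_order >= SKETCH_MULTI_PAGES_ORDER) {
		/* One block per 2^order pages, e.g. 64KiB -> four 16KiB blocks. */
		l.nr_blocks = (size_t)1 << (folio_order - SKETCH_MULTI_PAGES_ORDER);
		l.block_size = (size_t)1 << (SKETCH_PAGE_SHIFT + SKETCH_MULTI_PAGES_ORDER);
	} else {
		/* Assumption: smaller folios fall back to per-page 4KiB blocks. */
		l.nr_blocks = (size_t)1 << folio_order;
		l.block_size = (size_t)1 << SKETCH_PAGE_SHIFT;
	}
	return l;
}
```

Under these assumptions, `sketch_block_layout(2)` describes a 16KiB folio stored as a single 16KiB block, and `sketch_block_layout(4)` describes a 64KiB folio stored as four 16KiB blocks, matching the two examples in the cover letter.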