Re: [PATCH RFC v2 0/2] mTHP-friendly compression in zsmalloc and zram based on multi-pages

Nhat Pham <nphamcs@xxxxxxxxx> · Mon, 11 Nov 2024 11:30:10 -0800

On Thu, Nov 7, 2024 at 2:10 AM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> From: Barry Song <v-songbaohua@xxxxxxxx>
>
> When large folios are compressed at a larger granularity, we observe
> a notable reduction in CPU usage and a significant improvement in
> compression ratios.
>
> mTHP's ability to be swapped out without splitting and swapped back in
> as a whole allows compression and decompression at larger granularities.
>
> This patchset enhances zsmalloc and zram by adding support for dividing
> large folios into multi-page blocks, typically configured with a
> 2-order granularity. Without this patchset, a large folio is always
> divided into `nr_pages` 4KiB blocks.
>
> The granularity can be set using the `ZSMALLOC_MULTI_PAGES_ORDER`
> setting, where the default of 2 allows all anonymous THP to benefit.
>
> Examples include:
> * A 16KiB large folio will be compressed and stored as a single 16KiB
>   block.
> * A 64KiB large folio will be compressed and stored as four 16KiB
>   blocks.
>
> For example, swapping out and swapping in 100MiB of typical anonymous
> data 100 times (with 16KB mTHP enabled) using zstd yields the following
> results:
>
>                         w/o patches        w/ patches
> swap-out time(ms)       68711              49908
> swap-in time(ms)        30687              20685
> compression ratio       20.49%             16.9%

The data looks very promising :) My understanding is that it also
results in memory savings, right? zstd compresses better on larger
inputs.
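
To illustrate the "bigger inputs compress better" point, here is a rough
userspace sketch. It is not the zram/zsmalloc code path: zlib stands in
for zstd (only because it ships with Python), the data is synthetic
text-like filler, and `make_data` and `compressed_size` are hypothetical
helper names. It only compares compressing the same buffer as
independent 4KiB pages (order 0) versus 16KiB blocks (order 2).

```python
import random
import zlib

PAGE_SIZE = 4096  # assumed base page size


def make_data(n_pages, seed=0):
    """Build deterministic, text-like filler from a small random vocabulary."""
    rng = random.Random(seed)
    vocab = [
        bytes(rng.choices(range(97, 123), k=rng.randint(3, 10)))
        for _ in range(512)
    ]
    out = bytearray()
    while len(out) < n_pages * PAGE_SIZE:
        out += rng.choice(vocab) + b" "
    return bytes(out[: n_pages * PAGE_SIZE])


def compressed_size(data, order):
    """Total compressed bytes when each (PAGE_SIZE << order) block is
    compressed independently of the others."""
    block = PAGE_SIZE << order
    return sum(
        len(zlib.compress(data[i : i + block]))
        for i in range(0, len(data), block)
    )


data = make_data(256)  # 1MiB of filler
for order in (0, 2):
    size = compressed_size(data, order)
    print(f"order {order}: {size / len(data):.2%} of original size")
```

The absolute numbers mean nothing for real anonymous memory; the only
thing to look at is the direction of the difference between the two
orders.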

Is there any end-to-end benchmarking? My intuition is that this patch
series will improve things overall, assuming we don't fall back to
individual order-0 page swap-in too often, but it'd be nice to have
some data backing this intuition (especially with the upstream setup,
i.e., without any private patches). If the fallback scenario happens
frequently, the patch series can make a page fault more expensive
(since we have to decompress the entire chunk and discard everything
but the single page being loaded in), so it might make a difference.
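
A toy model of the fallback cost I have in mind, again in userspace
Python with zlib standing in for zstd. `store_block` and
`load_single_page` are made-up names, not anything from the patches; the
order of 2 matches the default mentioned in the cover letter. The point
is just that a single-page fault against a multi-page block pays for
decompressing the whole block.

```python
import zlib

PAGE_SIZE = 4096              # assumed base page size
ORDER = 2                     # default granularity from the cover letter
PAGES_PER_BLOCK = 1 << ORDER  # 4 pages, i.e. one 16KiB block


def store_block(pages):
    """Compress PAGES_PER_BLOCK pages together as one object."""
    assert len(pages) == PAGES_PER_BLOCK
    return zlib.compress(b"".join(pages))


def load_single_page(blob, index):
    """Order-0 fallback: decompress the whole block, keep one page.

    Returns the requested page and the number of decompressed bytes
    that were thrown away.
    """
    whole = zlib.decompress(blob)
    page = whole[index * PAGE_SIZE : (index + 1) * PAGE_SIZE]
    wasted = len(whole) - len(page)
    return page, wasted


# Four distinguishable pages so the slice can be checked.
pages = [bytes([i]) * PAGE_SIZE for i in range(PAGES_PER_BLOCK)]
blob = store_block(pages)
page, wasted = load_single_page(blob, 1)
print(f"decompressed {PAGES_PER_BLOCK * PAGE_SIZE} bytes, discarded {wasted}")
```

With order 2, three quarters of the decompressed output is discarded on
every fallback fault, which is why the fallback frequency matters.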

Not super qualified to comment on the zram changes otherwise - I'm just
a casual observer looking to see if we can adopt this for zswap. zswap
has the added complexity of not supporting THP zswap-in (until Usama's
patch series lands), and the presence of mixed backing states (due to
zswap writeback), both of which increase the likelihood of fallback :)