Re: [PATCH RFC v2 0/2] mTHP-friendly compression in zsmalloc and zram based on multi-pages

On Fri, Nov 8, 2024 at 6:23 PM Huang, Ying <ying.huang@xxxxxxxxx> wrote:
>
> Hi, Barry,
>
> Barry Song <21cnbao@xxxxxxxxx> writes:
>
> > From: Barry Song <v-songbaohua@xxxxxxxx>
> >
> > When large folios are compressed at a larger granularity, we observe
> > a notable reduction in CPU usage and a significant improvement in
> > compression ratios.
> >
> > mTHP's ability to be swapped out without splitting and swapped back in
> > as a whole allows compression and decompression at larger granularities.
> >
> > This patchset enhances zsmalloc and zram by adding support for dividing
> > large folios into multi-page blocks, typically configured with a
> > 2-order granularity. Without this patchset, a large folio is always
> > divided into `nr_pages` 4KiB blocks.
> >
> > The granularity can be set using the `ZSMALLOC_MULTI_PAGES_ORDER`
> > setting, where the default of 2 allows all anonymous THP to benefit.
> >
> > Examples include:
> > * A 16KiB large folio will be compressed and stored as a single 16KiB
> >   block.
> > * A 64KiB large folio will be compressed and stored as four 16KiB
> >   blocks.
> >
> > For example, swapping out and swapping in 100MiB of typical anonymous
> > data 100 times (with 16KB mTHP enabled) using zstd yields the following
> > results:
> >
> >                         w/o patches        w/ patches
> > swap-out time(ms)       68711              49908
> > swap-in time(ms)        30687              20685
> > compression ratio       20.49%             16.9%
>
> The data looks good.  Thanks!
>
> Have you considered the situation that the large folio fails to be
> allocated during swap-in?  It's possible because the memory may be very
> fragmented.

Right, good question. On phones, we use a large folio pool to maintain a
relatively high allocation success rate. When mTHP allocation fails, we have
a workaround that allocates nr_pages small folios and maps them together to
avoid partial reads. This ensures the benefits of larger-block compression
and decompression are consistently maintained. That is the code running
on production phones.

We also previously experimented with maintaining multiple buffers for
decompressed large blocks in zRAM, allowing upcoming do_swap_page() calls
to use them when falling back to small folios. In that setup, the buffers
achieved a high hit rate, though I don't recall the exact number.
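The buffer idea above can be sketched roughly as follows. This is a hedged
userspace illustration, not the production code; all names (block_buf,
buf_lookup, buf_install) and the tiny round-robin cache are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* After decompressing a 16KiB block to serve one 4KiB fault, keep the
 * whole block around so subsequent faults on sibling pages can hit the
 * buffer instead of decompressing the block again. */
#define PAGE_SIZE   4096
#define BLOCK_PAGES 4			/* order-2 block = 16KiB */
#define NR_BUFFERS  2			/* tiny cache, for illustration */

struct block_buf {
	long block_idx;			/* which 16KiB block; -1 = empty */
	unsigned char data[BLOCK_PAGES * PAGE_SIZE];
};

static struct block_buf bufs[NR_BUFFERS] = {
	{ .block_idx = -1 }, { .block_idx = -1 },
};

/* Return a pointer to the cached 4KiB page, or NULL on miss. */
static unsigned char *buf_lookup(long page_idx)
{
	long block = page_idx / BLOCK_PAGES;
	int i;

	for (i = 0; i < NR_BUFFERS; i++)
		if (bufs[i].block_idx == block)
			return bufs[i].data +
			       (page_idx % BLOCK_PAGES) * PAGE_SIZE;
	return NULL;
}

/* Install a freshly decompressed block, evicting slots round-robin. */
static void buf_install(long block_idx, const unsigned char *block)
{
	static int next;
	struct block_buf *b = &bufs[next];

	next = (next + 1) % NR_BUFFERS;
	b->block_idx = block_idx;
	memcpy(b->data, block, sizeof(b->data));
}
```

With such a cache, a fallback small-folio fault on any of the three sibling
pages of an already-decompressed block is a copy rather than a fresh
decompression, which is where the high hit rate came from.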

I'm concerned that this fault-around-like fallback to nr_pages small
folios may not gain traction upstream. Do you have any suggestions for
improvement?

>
> > -v2:
> >  While it is not mature yet, I know some people are waiting for
> >  an update :-)
> >  * Fixed some stability issues.
> >  * Rebased against the latest mm-unstable.
> >  * Set default order to 2, which benefits all anon mTHP.
> >  * Multi-page ZsPageMovable is not supported yet.
> >
> > Tangquan Zheng (2):
> >   mm: zsmalloc: support objects compressed based on multiple pages
> >   zram: support compression at the granularity of multi-pages
> >
> >  drivers/block/zram/Kconfig    |   9 +
> >  drivers/block/zram/zcomp.c    |  17 +-
> >  drivers/block/zram/zcomp.h    |  12 +-
> >  drivers/block/zram/zram_drv.c | 450 +++++++++++++++++++++++++++++++---
> >  drivers/block/zram/zram_drv.h |  45 ++++
> >  include/linux/zsmalloc.h      |  10 +-
> >  mm/Kconfig                    |  18 ++
> >  mm/zsmalloc.c                 | 232 +++++++++++++-----
> >  8 files changed, 699 insertions(+), 94 deletions(-)
>
> --
> Best Regards,
> Huang, Ying

Thanks
barry
