[RFC PATCH v1 13/13] mm: vmscan, swap, zswap: Compress batching of folios in shrink_folio_list().

Kanchana P Sridhar <kanchana.p.sridhar@xxxxxxxxx> · Thu, 17 Oct 2024 23:41:01 -0700

This patch enables the use of Intel IAA hardware compression acceleration
to reclaim a batch of folios in shrink_folio_list(). This results in
reclaim throughput and workload/sys performance improvements.

The earlier patches on compress batching deployed multiple IAA compress
engines for compressing up to SWAP_CRYPTO_SUB_BATCH_SIZE pages within a
large folio that is being stored in zswap_store(). This patch further
propagates the efficiency improvements demonstrated with IAA "batching
within folios", to vmscan "batching of folios" which will also use
batching within folios using the extensible architecture of
the __zswap_store_batch_core() procedure added earlier, that accepts
an array of folios.

A plug mechanism is introduced in swap_writepage() to aggregate a batch of
up to vm.compress-batchsize ([1, 32]) folios before processing the plug.
The plug will be processed if any of the following is true:

 1) The plug has vm.compress-batchsize folios. If the system has Intel IAA,
    "sysctl vm.compress-batchsize" can be configured to be in [1, 32]. On
    systems without IAA, or if CONFIG_ZSWAP_STORE_BATCHING_ENABLED is not
    set, "sysctl vm.compress-batchsize" can only be 1.
 2) A folio of a different swap type or folio_nid as the current folios in
    the plug, needs to be added to the plug.
 3) A pmd-mappable folio needs to be swapped out. In this case, the
    existing folios in the plug are processed. The pmd-mappable folio is
    swapped out (zswap_store() will batch compress
    SWAP_CRYPTO_SUB_BATCH_SIZE pages in the pmd-mappable folio if system
    has IAA) in a batch of its own.