On Mon, Jul 15, 2024 at 1:20 AM Takero Funaki <flintglass@xxxxxxxxx> wrote:
>
> On Sat, Jul 13, 2024 at 8:02 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
>
> It was tested on an Azure VM with SSD-backed storage. The total IOPS
> was capped at 4K IOPS by the VM host. The max throughput of the global
> shrinker was around 16 MB/s. Proactive shrinking cannot prevent
> pool_limit_hit since memory allocation can be on the order of GB/s.
> (The benchmark script allocates 2 GB sequentially, which was
> compressed to 1.3 GB, while the zswap pool was limited to 200 MB.)

Hmmm, I noticed that a lot of other swap read/write paths (in
__read_swap_cache_async(), or in shrink_lruvec()) do block device
plugging (blk_{start|finish}_plug()). The global shrinker path,
however, currently does not - it is triggered from a workqueue,
separate from all these reclaim paths.

I wonder if there is any value in doing the same for the zswap global
shrinker. We do acquire a mutex (which can sleep, and sleeping would
flush the plug) for every page, but IIUC we only sleep when the mutex
is already held by another task, and the mutex is per-CPU. The
compression algorithm is usually non-sleeping as well (e.g., zstd).
So maybe there could be some improvement in throughput here? A rough,
untested sketch is at the end of this mail.

(Btw - friendly reminder that everyone should use zsmalloc as the
default :))

Anyway, I haven't really played with this, and I don't have the right
setup that mimics your use case. If you do decide to give this a shot,
let me know :)
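
For reference, here's a minimal sketch of what I have in mind -
completely untested, and I'm hand-waving the existing shrink_worker()
body in mm/zswap.c, so treat it as an illustration of where the plug
would go rather than an actual patch:

/*
 * Rough sketch only (untested): mm/zswap.c would also need
 * #include <linux/blkdev.h>. The existing shrink_worker() body is
 * elided; only the plugging is new.
 */
static void shrink_worker(struct work_struct *w)
{
	struct blk_plug plug;

	/*
	 * Batch the swap writeback bios issued while the shrinker
	 * runs, so they reach the device together rather than one
	 * page at a time. Sleeping (e.g. when the per-CPU acomp
	 * mutex is contended) flushes the plug, but that should be
	 * the uncommon case.
	 */
	blk_start_plug(&plug);

	/*
	 * ... existing shrink_worker() body: walk the memcgs and
	 * write back entries until the pool drops below the
	 * acceptance threshold ...
	 */

	blk_finish_plug(&plug);
}

If plugging across the entire worker turns out to hold requests for
too long, the plug could instead be scoped to each memcg's batch of
writebacks.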