On Mon, Jul 15, 2024 at 1:20 AM Takero Funaki <flintglass@xxxxxxxxx> wrote:
>
> On Sat, Jul 13, 2024 at 8:02 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
>
> It was tested on an Azure VM with SSD-backed storage. The total IOPS
> was capped at 4K IOPS by the VM host. The max throughput of the global
> shrinker was around 16 MB/s. Proactive shrinking cannot prevent
> pool_limit_hit since memory allocation can be on the order of GB/s.
> (The benchmark script allocates 2 GB sequentially, which was
> compressed to 1.3 GB, while the zswap pool was limited to 200 MB.)

Hmmm, I noticed that a lot of other swap read/write paths (in
__read_swap_cache_async(), or in shrink_lruvec()) do block device
plugging (blk_{start|finish}_plug()). The global shrinker path,
however, currently does not - it is triggered from a workqueue,
separate from all these reclaim paths.

I wonder if there is any value in doing the same for the zswap global
shrinker. We do acquire a mutex (which can sleep, and sleeping would
flush the plug) for every page, but IIUC we only sleep when the mutex
is already held by another task, and the mutex is per-CPU. The
compression algorithm is usually non-sleeping as well (e.g., zstd).
So maybe there could be some improvement in throughput here? A rough,
untested sketch is at the end of this mail.

(Btw - friendly reminder that everyone should use zsmalloc as the
default :))

Anyway, I haven't really played with this, and I don't have the right
setup that mimics your use case. If you do decide to give this a shot,
let me know :)
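
For reference, here's a minimal sketch of what I have in mind -
completely untested, and I'm hand-waving the existing shrink_worker()
body in mm/zswap.c, so treat it as an illustration of where the plug
would go rather than an actual patch:

/*
 * Rough sketch only (untested): mm/zswap.c would also need
 * #include <linux/blkdev.h>. The existing shrink_worker() body is
 * elided; only the plugging is new.
 */
static void shrink_worker(struct work_struct *w)
{
	struct blk_plug plug;

	/*
	 * Batch the swap writeback bios issued while the shrinker
	 * runs, so they reach the device together rather than one
	 * page at a time. Sleeping (e.g. when the per-CPU acomp
	 * mutex is contended) flushes the plug, but that should be
	 * the uncommon case.
	 */
	blk_start_plug(&plug);

	/*
	 * ... existing shrink_worker() body: walk the memcgs and
	 * write back entries until the pool drops below the
	 * acceptance threshold ...
	 */

	blk_finish_plug(&plug);
}

If plugging across the entire worker turns out to hold requests for
too long, the plug could instead be scoped to each memcg's batch of
writebacks.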