v3:
  * No (intended) functional change.
  * Small cleanups, renamings, etc. (suggested by Yosry Ahmed).
v2:
  * Add more details in comments, patch changelog, documentation, etc.
    about the second chance scheme and its ability to modulate the
    writeback rate (patch 1) (suggested by Yosry Ahmed).
  * Move the referenced bit (patch 1) (suggested by Yosry Ahmed).

When experimenting with the memory-pressure based (i.e. "dynamic") zswap
shrinker in production, we observed a sharp increase in the number of
swapins, which led to a performance regression. We were able to trace this
regression to the following problems with the shrinker's warm page
protection scheme:

1. The protection decays far too rapidly, and the decaying is coupled with
   zswap stores, leading to anomalous patterns in which a small batch of
   zswap stores effectively erases all the protection in place for the
   warmer pages in the zswap LRU.

   This observation has also been corroborated upstream by Takero Funaki
   (in [1]).

2. We inaccurately track the number of swapped-in pages, missing the
   non-pivot pages that are part of the readahead window, while counting
   the pages that are found in the zswap pool.

To alleviate these two issues, this patch series improves the dynamic zswap
shrinker in the following manner:

1. Replace the protection size tracking scheme with a second chance
   algorithm. This new scheme removes the need for haphazard stats
   decaying, and automatically adjusts the pace of page aging with memory
   pressure, and the writeback rate with pool activity: slowing down when
   the pool is dominated by zswpouts, and speeding up when the pool is
   dominated by stale entries.

2. Fix the tracking of the number of swapins to take into account
   non-pivot pages in the readahead window.

With these two changes in place, in a kernel-building benchmark without any
cold data added, the number of swapins is reduced by 64.12%. This
translates to a 10.32% reduction in build time. We also observe a 3%
reduction in kernel CPU time.
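
For readers unfamiliar with the technique, the second chance scheme in
item 1 above can be illustrated with a minimal, self-contained userspace
sketch. This is not the code in patch 1: struct entry, entry_store() and
shrink_pass() are hypothetical names made up for this illustration, the
LRU is modelled as a plain array, and rotating a spared entry to the LRU
tail is omitted. It only shows the general idea of a referenced bit that
is set on store and cleared by the first shrinker visit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a zswap entry; NOT the kernel's struct. */
struct entry {
	bool referenced;   /* set on store, cleared by the first shrinker visit */
	bool written_back; /* set once the entry has been evicted */
};

/* Model a zswap store: the entry is (re)inserted and marked as warm. */
static void entry_store(struct entry *e)
{
	e->referenced = true;
	e->written_back = false;
}

/*
 * One shrinker pass over the LRU (index 0 is the oldest entry).
 * An entry whose referenced bit is set gets a second chance: the bit
 * is cleared and the entry is skipped. An entry whose bit is already
 * clear is written back. Returns the number of entries written back.
 */
static size_t shrink_pass(struct entry *lru, size_t n)
{
	size_t written = 0;

	assert(lru != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (lru[i].written_back)
			continue;		/* already evicted */
		if (lru[i].referenced) {
			lru[i].referenced = false; /* second chance */
			continue;
		}
		lru[i].written_back = true;	/* stale: write it back */
		written++;
	}
	return written;
}
```

In this model, a pool dominated by fresh stores has most bits set, so a
pass mostly clears bits and writes back little; a pool dominated by stale
entries has most bits already clear, so the same pass writes back more.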

In another benchmark, with cold data added (to gauge the new algorithm's
ability to offload cold data), the new second chance scheme outperforms the
old protection scheme by around 0.7%, and actually writes back around 21%
more pages to the backing swap device. So the new scheme is at least as
good as, if not better than, the old scheme on this front as well.

[1]: https://lore.kernel.org/linux-mm/CAPpodddcGsK=0Xczfuk8usgZ47xeyf4ZjiofdT+ujiyz6V2pFQ@xxxxxxxxxxxxxx/

Nhat Pham (2):
  zswap: implement a second chance algorithm for dynamic zswap shrinker
  zswap: track swapins from disk more accurately

 include/linux/zswap.h |  16 +++----
 mm/page_io.c          |  11 ++++-
 mm/swap_state.c       |   8 +---
 mm/zswap.c            | 108 ++++++++++++++++++++++++------------------
 4 files changed, 82 insertions(+), 61 deletions(-)


base-commit: cca1345bd26a67fc61a92ff0c6d81766c259e522
-- 
2.43.0