On Mon, Dec 18, 2023 at 11:21 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote: > > On Mon, Dec 18, 2023 at 6:44 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote: > > > > On Fri, Dec 15, 2023 at 01:21:57PM -0800, Yosry Ahmed wrote: > > > On Thu, Dec 7, 2023 at 11:24 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote: > > > > > > > > During our experiment with zswap, we sometimes observe swap IOs due to > > > > occasional zswap store failures and writebacks-to-swap. These swapping > > > > IOs prevent many users who cannot tolerate swapping from adopting zswap > > > > to save memory and improve performance where possible. > > > > > > > > This patch adds the option to disable this behavior entirely: do not > > > > writeback to backing swapping device when a zswap store attempt fail, > > > > and do not write pages in the zswap pool back to the backing swap > > > > device (both when the pool is full, and when the new zswap shrinker is > > > > called). > > > > > > > > This new behavior can be opted-in/out on a per-cgroup basis via a new > > > > cgroup file. By default, writebacks to swap device is enabled, which is > > > > the previous behavior. Initially, writeback is enabled for the root > > > > cgroup, and a newly created cgroup will inherit the current setting of > > > > its parent. > > > > > > > > Note that this is subtly different from setting memory.swap.max to 0, as > > > > it still allows for pages to be stored in the zswap pool (which itself > > > > consumes swap space in its current form). > > > > > > > > This patch should be applied on top of the zswap shrinker series: > > > > > > > > https://lore.kernel.org/linux-mm/20231130194023.4102148-1-nphamcs@xxxxxxxxx/ > > > > > > > > as it also disables the zswap shrinker, a major source of zswap > > > > writebacks. > > > > > > > > Suggested-by: Johannes Weiner <hannes@xxxxxxxxxxx> > > > > Signed-off-by: Nhat Pham <nphamcs@xxxxxxxxx> > > > > Reviewed-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx> > > > > > > Taking a step back from all the memory.swap.tiers vs. > > > memory.zswap.writeback discussions, I think there may be a more > > > fundamental problem here. If the zswap store failure is recurrent, > > > pages can keep going back to the LRUs and then sent back to zswap > > > eventually, only to be rejected again. For example, this can if zswap > > > is above the acceptance threshold, but could be even worse if it's the > > > allocator rejecting the page due to not compressing well enough. In > > > the latter case, the page can keep going back and forth between zswap > > > and LRUs indefinitely. > > > > > > You probably did not run into this as you're using zsmalloc, but it > > Which is why I recommend everyone to use zsmalloc, and change the > default allocator to it in Kconfig :) > Internally, we have a cap on the compression ratio, after which we reject pages because it doesn't make sense to store them (e.g. zsmalloc will store them in a full page anyway, or the compressed size + metadata isn't worth it). I think this is where we should head upstream as well, you proposed something in the right direction with storing uncompressed pages in zswap. IOW, I think such pages should be taken out of the LRUs one way or another.