On Wed, Mar 20, 2024 at 2:51 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote: > > On Wed, Mar 20, 2024 at 02:08:22AM +0000, Yosry Ahmed wrote: > > Currently, the number of protected zswap entries corresponding to an > > lruvec are incremented every time we swapin a page. > > Correct. This is the primary signal that the shrinker is being too > aggressive in moving entries to disk and should slow down...? Yup. Currently, there are two scenarios in which we increase zswap protection area: 1. zswap lru movement - for instance, if a page for some reasons is rotated in the LRU. When this happens, we increment the protection size, so that the page at the tail end of the protected area does not lose its protection because of (potentially) spurious LRU churnings. 2. swapin - when this happens, it is a signal that the shrinker is being too... enthusiastic in its reclaiming action. We should be more conservative, hence the increase in protection. I think there's some confusion around this, because we use the same-ish mechanism for two different events. Maybe I should have put yet another fat comment at the callsites of zswap_folio_swapin() too :) > > > This happens regardless of whether or not the page originated in > > zswap. Hence, swapins from disk will lead to increasing protection > > on potentially stale zswap entries. Furthermore, the increased Hmmm my original thinking was that, had we protected the swapped-in page back when it was still in zswap, we would have prevented this IO. And since the pages in zswap are (at least conceptually) "warmer" than the swapped in page, it is appropriate to increase the zswap protection. In fact, I was toying with the idea to max out the protection on swap-in to temporarily cease all zswap shrinking, but that is perhaps overkill :) > > shrinking protection can lead to more pages skipping zswap and going I think this can only happen when the protection is so strong that the zswap pool is full, right? In practice I have never seen this happen though. Did you observe it? We have a fairly aggressive lru-size-based protection decaying as well, so that should not happen... Also technically the protection only applies to the dynamic shrinker - the capacity-driven shrinking is not affected (although it's not very effective - perhaps we can rework this?) > > to disk, eventually leading to even more swapins from disk and > > starting a vicious circle. > > How does shrinker protection affect zswap stores? > > On the contrary, I would expect this patch to create a runaway > shrinker. The more aggressively it moves entries out to disk, the > lower the rate of zswap loads, the more aggressively it moves more > entries out to disk. > > > Instead, only increase the protection when pages are loaded from zswap. > > This also has a nice side effect of removing zswap_folio_swapin() and > > replacing it with a static helper that is only called from zswap_load(). > > > > No problems were observed in practice, this was found through code > > inspection. > > This is missing test results :) Yeah it'd be nice to see benchmark numbers :) I mean all this protection is heuristics anyway, so maybe I'm just being overly conservative. But only numbers can show this.