On Thu, Nov 30, 2023 at 06:44:24PM +0000, Shakeel Butt wrote:
> On Thu, Nov 30, 2023 at 07:36:53AM -0800, Dan Schatzberg wrote:
> > (Sorry for the resend - forgot to cc the mailing lists)
> >
> > This patch proposes augmenting the memory.reclaim interface with a
> > swappiness=<val> argument that overrides the swappiness value for that
> > instance of proactive reclaim.
> >
> > Userspace proactive reclaimers use the memory.reclaim interface to
> > trigger reclaim. The memory.reclaim interface does not allow for any way
> > to affect the balance of file vs anon during proactive reclaim. The only
> > approach is to adjust the vm.swappiness setting. However, there are a
> > few reasons we look to control the balance of file vs anon during
> > proactive reclaim, separately from reactive reclaim:
> >
> > * Swapout should be limited to manage SSD write endurance. In near-OOM
>
> Is this about swapout to SSD only?
>
> > situations we are fine with lots of swap-out to avoid OOMs. As these are
> > typically rare events, they have relatively little impact on write
> > endurance. However, proactive reclaim runs continuously and so its
> > impact on SSD write endurance is more significant. Therefore it is
> > desirable to control swap-out for proactive reclaim separately from
> > reactive reclaim
>
> This is understandable but swapout to zswap should be fine, right?
> (Sorry I am not following the discussion on zswap patches from Nhat. Is
> the answer there?)

Memory compression alone would be fine, yes. However, we don't use zswap
in all cgroups. Lower-priority things are forced directly to disk. Some
workloads compress poorly and also go directly to disk for better memory
efficiency. On such cgroups, it's important for proactive reclaim to
manage swap rates to avoid burning out the flash.

Note that zswap also does SSD writes during writeback. I know this
doesn't apply to Google because of the ghost files, but we have SSD
swapfiles behind zswap.
And this part will become more relevant with Nhat's enhanced writeback patches.
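For reference, a userspace proactive reclaimer would drive the proposed
interface roughly like this. This is only a sketch under the semantics
described in the patch (per-request swappiness override on cgroup v2's
memory.reclaim); the cgroup path, reclaim amount, and the helper function
name are illustrative assumptions, not part of the patch:

```shell
#!/bin/sh
# Sketch of a proactive reclaim request using the proposed
# "swappiness=" argument to memory.reclaim. build_reclaim_request
# is a hypothetical helper; the cgroup path below is illustrative.

build_reclaim_request() {
    # $1 = amount to reclaim (e.g. 100M), $2 = swappiness override (0-200)
    printf '%s swappiness=%s' "$1" "$2"
}

# swappiness=0: reclaim file pages only, so continuous proactive
# reclaim generates no swap-out and spares SSD write endurance.
# Reactive (near-OOM) reclaim still uses vm.swappiness as before.
req=$(build_reclaim_request 100M 0)
echo "$req"
# In a real reclaimer, the request would be written to the cgroup:
# echo "$req" > /sys/fs/cgroup/workload.slice/memory.reclaim
```

The point of the per-request override is exactly the split argued for
above: the reclaimer can run continuously with swappiness=0 while the
kernel's reactive reclaim keeps the global vm.swappiness behavior.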