On Fri, Jul 5, 2024 at 7:25 PM Takero Funaki <flintglass@xxxxxxxxx> wrote: > > To prevent the zswap global shrinker from writing back pages > simultaneously with IO performed for memory reclaim and faults, delay > the writeback when zswap_store() rejects pages or zswap_load() cannot > find entry in pool. > > When the zswap shrinker is running and zswap rejects an incoming page, > simulatenous zswap writeback and the rejected page lead to IO contention > on swap device. In this case, the writeback of the rejected page must be > higher priority as it is necessary for actual memory reclaim progress. > The zswap global shrinker can run in the background and should not > interfere with memory reclaim. Hmm what about this scenario: when we disable zswap writeback on a cgroup, if zswap_store() fails, we are delaying the global shrinker for no gain essentially. There is no subsequent IO. I don't think we are currently handling this, right? > > The same logic applies to zswap_load(). When zswap cannot find requested > page from pool and read IO is performed, shrinker should be interrupted. > Yet another (less concerning IMHO) scenario is when a cgroup disables zswap by setting zswap.max = 0 (for instance, if the sysadmin knows that this cgroup's data are really cold, and/or that the workload is latency-tolerant, and do not want it to take up valuable memory resources of other cgroups). Every time this cgroup reclaims memory, it would disable the global shrinker (including the new proactive behavior) for other cgroup, correct? And, when they do need to swap in, it would further delay the global shrinker. Would this break of isolation be a problem? There are other concerns I raised in the cover letter's response as well - please take a look :)