2024年7月11日(木) 7:10 Nhat Pham <nphamcs@xxxxxxxxx>: > > > Yet another (less concerning IMHO) scenario is when a cgroup disables > > > zswap by setting zswap.max = 0 (for instance, if the sysadmin knows > > > that this cgroup's data are really cold, and/or that the workload is > > > latency-tolerant, and do not want it to take up valuable memory > > > resources of other cgroups). Every time this cgroup reclaims memory, > > > it would disable the global shrinker (including the new proactive > > > behavior) for other cgroup, correct? And, when they do need to swap > > > in, it would further delay the global shrinker. Would this break of > > > isolation be a problem? > > > > > > There are other concerns I raised in the cover letter's response as > > > well - please take a look :) > > > > I haven't considered these cases much, but I suppose the global > > shrinker should be delayed in both cases as well. In general, any > > pagein/out should be prefered over shrinker writeback throughput. > > > > When zswap writeback was disabled for a memcg > > (memcg.zswap.writeback=0), I suppose disabling/delaying writeback is > > harmless. > > If the rejection incurs no IO, there is no more memory pressure and > > shrinking is not urgent. We can postpone the shrinker writeback. If > > the rejection incurs IO (i.e. mm choose another page from a memcg with > > writeback enabled), again we should delay the shrinker. > > You are delaying writeback globally right? IOW, other cgroup is also affected? > > Say we are under memory pressure, with two cgroups under reclaim - one > with zswap writeback disabled. The one with writeback disabled will > constantly fail at zswap_store(), and delay the global shrinking > action, which could have targeted the other cgroup (which should > proceed fully because there is no contention here since the first > cgroup is not swapping either). > Thanks, I think I understand your concern. Even if zswap rejected pages, that does not mean we need IO because memory.zswap.writeback=0 also disables memory-to-disk writeback... And yes, v2 interrupts the shrinker in this case, which is unnecessary. I'll move the timer updates to page_io.c like this: --- a/mm/page_io.c +++ b/mm/page_io.c @@ -205,6 +205,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) folio_mark_dirty(folio); return AOP_WRITEPAGE_ACTIVATE; } + zswap_shrinker_delay_extend(); __swap_writepage(folio, wbc); return 0; This extends the delay only if actual I/O is necessary.