Re: [PATCH v2 6/6] mm: zswap: interrupt shrinker writeback while pagein/out IO

Nhat Pham <nphamcs@xxxxxxxxx> · Mon, 8 Jul 2024 17:57:39 -0700

On Fri, Jul 5, 2024 at 7:25 PM Takero Funaki <flintglass@xxxxxxxxx> wrote:
>
> To prevent the zswap global shrinker from writing back pages
> simultaneously with IO performed for memory reclaim and faults, delay
> the writeback when zswap_store() rejects pages or zswap_load() cannot
> find entry in pool.
>
> When the zswap shrinker is running and zswap rejects an incoming page,
> simulatenous zswap writeback and the rejected page lead to IO contention
> on swap device. In this case, the writeback of the rejected page must be
> higher priority as it is necessary for actual memory reclaim progress.
> The zswap global shrinker can run in the background and should not
> interfere with memory reclaim.

Hmm what about this scenario: when we disable zswap writeback on a
cgroup, if zswap_store() fails, we are delaying the global shrinker
for no gain essentially. There is no subsequent IO. I don't think we
are currently handling this, right?

>
> The same logic applies to zswap_load(). When zswap cannot find requested
> page from pool and read IO is performed, shrinker should be interrupted.
>

Yet another (less concerning IMHO) scenario is when a cgroup disables
zswap by setting zswap.max = 0 (for instance, if the sysadmin knows
that this cgroup's data are really cold, and/or that the workload is
latency-tolerant, and do not want it to take up valuable memory
resources of other cgroups). Every time this cgroup reclaims memory,
it would disable the global shrinker (including the new proactive
behavior) for other cgroup, correct? And, when they do need to swap
in, it would further delay the global shrinker. Would this break of
isolation be a problem?

There are other concerns I raised in the cover letter's response as
well - please take a look :)