Re: [PATCH] memcg: introduce per-memcg reclaim interface

Shakeel Butt <shakeelb@xxxxxxxxxx> · Tue, 6 Oct 2020 09:55:43 -0700

On Thu, Oct 1, 2020 at 7:33 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
[snip]
> > >    So instead of asking users for a target size whose suitability
> > >    heavily depends on the kernel's LRU implementation, the readahead
> > >    code, the IO device's capability and general load, why not directly
> > >    ask the user for a pressure level that the workload is comfortable
> > >    with and which captures all of the above factors implicitly? Then
> > >    let the kernel do this feedback loop from a per-cgroup worker.
> >
> > I am assuming here by pressure level you are referring to the PSI like
> > interface e.g. allowing the users to tell about their jobs that X
> > amount of stalls in a fixed time window is tolerable.
>
> Right, essentially the same parameters that psi poll() would take.

I thought a bit more on the semantics of the psi usage for the
proactive reclaim.

Suppose I have a top-level cgroup 'A' on which I want to enable
proactive reclaim. Which memory psi events should the proactive
reclaim consider?

The simplest would be the memory.psi at 'A'. However, memory.psi is
hierarchical, and I would not really want the pressure due to limits in
children of 'A' to impact the proactive reclaim. PSI due to refaults
and slow IO should be included or maybe only those which are caused by
the proactive reclaim itself. I am undecided on the PSI due to
compaction. PSI due to global reclaim for 'A' is even more
complicated. This is a stall due to reclaiming from the whole system,
including 'A' itself. It might not really cause more refaults and IOs
for 'A'. Should proactive reclaim ignore the pressure due to global
reclaim when tuning its aggressiveness?
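
To make the concern concrete, here is a rough userspace sketch of the
kind of feedback loop being discussed. It is only an illustration, not
a proposed implementation: the policy function next_reclaim_bytes(),
its halving/growth factors, and the byte bounds are all made up for
this example. The only real thing it relies on is the text format of a
PSI file (the "some"/"full" lines with avg10/avg60/avg300/total). Note
that the single "some avg10" number fed to the policy is an aggregate:
it does not say whether the stall came from a child's limit, global
reclaim, compaction, or the proactive reclaim itself, which is exactly
the ambiguity described above.

```python
def parse_pressure(text):
    """Parse PSI-formatted text into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        # Each field looks like "avg10=0.50" or "total=123456".
        result[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return result


def next_reclaim_bytes(current, some_avg10, target_pct,
                       min_bytes=1 << 20, max_bytes=1 << 30):
    """Hypothetical policy: halve the step when observed pressure is
    above the tolerated level, otherwise grow it by 25%, clamped to
    [min_bytes, max_bytes]."""
    if some_avg10 > target_pct:
        proposed = current // 2
    else:
        proposed = current + current // 4
    return max(min_bytes, min(max_bytes, proposed))


# Sample text in the PSI file format; the values are invented.
sample = (
    "some avg10=0.50 avg60=0.20 avg300=0.10 total=123456\n"
    "full avg10=0.10 avg60=0.05 avg300=0.01 total=23456\n"
)
pressure = parse_pressure(sample)
step = next_reclaim_bytes(64 << 20, pressure["some"]["avg10"], target_pct=1.0)
```

With the sample above, the observed 0.50% is below the tolerated 1.0%,
so the step grows from 64 MiB to 80 MiB. If part of that 0.50% were
caused by a child hitting its own limit, the loop would still count it
against the target, which is the behavior I would like to avoid.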

Am I overthinking here?