On Fri, Dec 2, 2022 at 1:38 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, 1 Dec 2022 15:33:17 -0800 Mina Almasry <almasrymina@xxxxxxxxxx> wrote:
>
> > Reclaiming directly from top tier nodes breaks the aging pipeline of
> > memory tiers. If we have a RAM -> CXL -> storage hierarchy, we
> > should demote from RAM to CXL and from CXL to storage. If we reclaim
> > a page from RAM, it means we 'demote' it directly from RAM to storage,
> > bypassing potentially a huge amount of pages colder than it in CXL.
> >
> > However disabling reclaim from top tier nodes entirely would cause ooms
> > in edge scenarios where lower tier memory is unreclaimable for whatever
> > reason, e.g. memory being mlocked() or too hot to reclaim. In these
> > cases we would rather the job run with a performance regression rather
> > than it oom altogether.
> >
> > However, we can disable reclaim from top tier nodes for proactive reclaim.
> > That reclaim is not real memory pressure, and we don't have any cause to
> > be breaking the aging pipeline.
> >
>
> Is this purely from code inspection, or are there quantitative
> observations to be shared?
>

This is from code inspection, but also it is by definition. Proactive
reclaim is when the userspace does:

    echo "1m" > /path/to/cgroup/memory.reclaim

At that point the kernel tries to proactively reclaim 1 MB from that
cgroup at the userspace's behest, regardless of the actual memory
pressure in the cgroup, so proactive reclaim is not real memory
pressure as I state in the commit message.

Proactive reclaim is triggered in the code by memory_reclaim():
https://elixir.bootlin.com/linux/v6.1-rc7/source/mm/memcontrol.c#L6572

Which sets MEMCG_RECLAIM_PROACTIVE:
https://elixir.bootlin.com/linux/v6.1-rc7/source/mm/memcontrol.c#L6586

Which in turn sets sc->proactive:
https://elixir.bootlin.com/linux/v6.1-rc7/source/mm/vmscan.c#L6743

In my patch I only allow falling back to reclaim from top tier nodes
if !sc->proactive.
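
To make the intended behaviour concrete, here is a rough standalone
sketch of the decision. This is not the code from the patch: the
struct below is a cut-down stand-in for the kernel's scan_control, and
choose_action(), enum page_action and the two boolean parameters are
names I made up for illustration. Only the sc->proactive flag
corresponds to something in the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-in for the kernel's struct scan_control (assumption). */
struct scan_control {
	bool proactive;	/* reclaim was requested through memory.reclaim */
};

/* Hypothetical outcomes for a cold page that reclaim is looking at. */
enum page_action {
	ACTION_DEMOTE,	/* move the page to the next lower tier */
	ACTION_RECLAIM,	/* reclaim the page from its current node */
	ACTION_KEEP,	/* leave the page where it is */
};

/*
 * Illustrative only: pick an action for one page.
 *
 * node_is_top_tier   - the page sits on a top tier node (e.g. RAM)
 * demotion_succeeded - the attempt to demote the page worked
 */
static enum page_action choose_action(const struct scan_control *sc,
				      bool node_is_top_tier,
				      bool demotion_succeeded)
{
	assert(sc != NULL);

	/* Simplification: lower tier pages are always reclaimed normally. */
	if (!node_is_top_tier)
		return ACTION_RECLAIM;

	/* Preferred path: keep the aging pipeline intact by demoting. */
	if (demotion_succeeded)
		return ACTION_DEMOTE;

	/*
	 * Demotion failed. Under real memory pressure, fall back to
	 * reclaiming from the top tier rather than risk an oom. For
	 * proactive reclaim there is no real pressure, so skip the page.
	 */
	if (!sc->proactive)
		return ACTION_RECLAIM;

	return ACTION_KEEP;
}
```

With sc->proactive set and a failed demotion the page is left alone;
with sc->proactive clear the same page is reclaimed from the top tier
as a last resort.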
I was in the process of sending a v2 with the comment fix btw, but I'll hold back on that since it seems you already merged the patch to unstable. Thanks! If I end up sending another version of the patch it should come with the comment fix.