On Wed, Oct 20, 2021 at 4:55 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Wed 20-10-21 15:33:39, Zhaoyang Huang wrote:
> [...]
> > Do you mean that direct reclaim should succeed for the first round
> > reclaim within which memcg get protected by memory.low and would NOT
> > retry by setting memcg_low_reclaim to true?
>
> Yes, this is the semantic of low limit protection in the upstream
> kernel. Have a look at do_try_to_free_pages and how it sets
> memcg_low_reclaim only if there were no pages reclaimed.
>
> > It is not true in android
> > like system, where reclaim always failed and introduce lmk and even
> > OOM.
>
> I am not familiar with android specific changes to the upstream reclaim
> logic. You should be investigating why the reclaim couldn't make a
> forward progress (aka reclaim pages) from non-protected memcgs. There
> are tracepoints you can use (generally vmscan prefix).

OK, I think I see where the confusion comes from now. You are analysing
the cgroup behaviour against a pre-defined workload and memory pattern
in which the design works as intended, i.e. processes in root provide
memory before the protected memcgs get reclaimed. See [1] for the
hierarchy I have in mind: the effective userspace workloads are located
in the protected groups and the rest of the processes are non-grouped.

In practice, the non-grouped ones cannot provide enough memory, as they
are kernel threads and processes with few pages on the LRU (they only
contain control logic). The practical scenario is that groupA issues a
high-order kmalloc, which triggers reclaim (kswapd and direct reclaim).
As I said, the non-grouped ones cannot provide enough contiguous memory
blocks, so direct reclaim quickly fails in the first round. What I am
trying to do is let kswapd try harder for the target. It is also fair
if groupA, groupB and groupC are trapped in the slow path concurrently.

[1]
                         root
            |             |        |        |
non-grouped processes   groupA   groupB   groupC

>
> --
> Michal Hocko
> SUSE Labs
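
For reference, the two-pass semantic discussed above (protected memcgs
are skipped first, and memcg_low_reclaim is set only if nothing was
reclaimed) can be sketched as a small user-space model. This is only a
rough illustration, not kernel code: struct toy_memcg, struct
toy_scan_control, toy_shrink() and toy_try_to_free_pages() are
hypothetical names, and only the memcg_low_reclaim / memcg_low_skipped
flags are borrowed from the thread. It counts pages only, so it does not
model contiguity or allocation order, which is the core of the
high-order kmalloc scenario.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg; NOT a kernel structure. */
struct toy_memcg {
    unsigned long usage;    /* reclaimable pages currently charged */
    unsigned long low;      /* memory.low protection, in pages */
};

/* Carries only the two flags discussed in the thread. */
struct toy_scan_control {
    unsigned long nr_to_reclaim;
    unsigned long nr_reclaimed;
    bool memcg_low_reclaim;  /* may reclaim from protected groups */
    bool memcg_low_skipped;  /* a protected group was skipped */
};

/* One reclaim pass over all groups. */
static void toy_shrink(struct toy_memcg *groups, size_t n,
                       struct toy_scan_control *sc)
{
    assert(groups != NULL && sc != NULL);

    for (size_t i = 0; i < n && sc->nr_reclaimed < sc->nr_to_reclaim; i++) {
        struct toy_memcg *g = &groups[i];
        bool below_low = g->low > 0 && g->usage <= g->low;
        unsigned long avail, need, take;

        if (below_low && !sc->memcg_low_reclaim) {
            /* Protected group: skip it, but remember that we did. */
            sc->memcg_low_skipped = true;
            continue;
        }

        /*
         * Simplification (assumption): without low reclaim only the
         * excess above memory.low is taken; the real kernel scales
         * scan pressure instead.
         */
        if (sc->memcg_low_reclaim || g->usage < g->low)
            avail = g->usage;
        else
            avail = g->usage - g->low;

        need = sc->nr_to_reclaim - sc->nr_reclaimed;
        take = avail < need ? avail : need;
        g->usage -= take;
        sc->nr_reclaimed += take;
    }
}

/* Retry with low reclaim only if the first pass reclaimed nothing. */
static unsigned long toy_try_to_free_pages(struct toy_memcg *groups,
                                           size_t n,
                                           unsigned long nr_to_reclaim)
{
    struct toy_scan_control sc = { .nr_to_reclaim = nr_to_reclaim };

retry:
    toy_shrink(groups, n, &sc);
    if (sc.nr_reclaimed)
        return sc.nr_reclaimed;

    if (sc.memcg_low_skipped) {
        sc.memcg_low_reclaim = true;
        sc.memcg_low_skipped = false;
        goto retry;
    }
    return 0;
}
```

In this model, if the non-grouped set holds 10 pages and groupA sits
exactly at its low limit of 100, a request for 4 pages is served from
the non-grouped set and groupA is untouched. If the non-grouped set is
empty, the first pass reclaims nothing and the retry takes the 4 pages
from groupA.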