On Sun, Dec 15, 2024 at 5:54 PM hailong <hailong.liu@xxxxxxxx> wrote:
>
> On Fri, 13. Dec 09:06, T.J. Mercier wrote:
> > On Thu, Dec 12, 2024 at 6:26 PM hailong <hailong.liu@xxxxxxxx> wrote:
> > >
> > > On Thu, 12. Dec 10:22, T.J. Mercier wrote:
> > > > On Thu, Dec 12, 2024 at 1:57 AM hailong <hailong.liu@xxxxxxxx> wrote:
> > > > >
> > > > > From: Hailong Liu <hailong.liu@xxxxxxxx>
> > > > >
> > > > > commit a579086c99ed ("mm: multi-gen LRU: remove eviction fairness safeguard") said:
> > > > > Note that memcg LRU only applies to global reclaim. For memcg reclaim,
> > > > > the eviction will continue, even if it is overshooting. This becomes
> > > > > unconditional due to code simplification.
> > > > >
> > > > > However, if we reclaim the root memcg via the memory.reclaim cgroup interface, the
> > > > > behavior is the same as kswapd or direct reclaim.
> > > >
> > > > Hi Hailong,
> > > >
> > > > Why do you think this is a problem?
> > > >
> > > > > Fix this by removing the mem_cgroup_is_root() condition in
> > > > > root_reclaim().
> > > > > Signed-off-by: Hailong Liu <hailong.liu@xxxxxxxx>
> > > > > ---
> > > > >  mm/vmscan.c | 2 +-
> > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > index 76378bc257e3..1f74f3ba0999 100644
> > > > > --- a/mm/vmscan.c
> > > > > +++ b/mm/vmscan.c
> > > > > @@ -216,7 +216,7 @@ static bool cgroup_reclaim(struct scan_control *sc)
> > > > >   */
> > > > >  static bool root_reclaim(struct scan_control *sc)
> > > > >  {
> > > > > -	return !sc->target_mem_cgroup || mem_cgroup_is_root(sc->target_mem_cgroup);
> > > > > +	return !sc->target_mem_cgroup;
> > > > >  }
> > > > >
> > > > >  /**
> > > > > --
> > > > > Actually we switched to MGLRU on kernel 6.1 and see different behavior for
> > > > > root_mem_cgroup reclaim, so is there any background for this?
> > > >
> > > > Reclaim behavior differs with MGLRU.
> > > > https://lore.kernel.org/lkml/20221201223923.873696-1-yuzhao@xxxxxxxxxx/
> > > >
> > > > On even more recent kernels, regular LRU reclaim has also changed.
> > > > https://lore.kernel.org/lkml/20240514202641.2821494-1-hannes@xxxxxxxxxxx/
> > >
> > > Thanks for the details.
> > >
> > > Take this as an example.
> > >      root
> > >     /  |  \
> > >    /   |   \
> > >   a    b    c
> > >        | \
> > >        |  \
> > >        d   e
> > > IIUC, MGLRU can resolve the direct reclaim latency due to the
> > > sharding. However, for proactive reclaim, if we want to reclaim
> > > b, the order is b->d->e; but if we reclaim the root, the reclaim path
> > > is uncertain. The call stack is as follows:
> > > lru_gen_shrink_node()->shrink_many()->hlist_nulls_for_each_entry_rcu()->shrink_one()
> > >
> > > So, for proactive reclaim of root_memcg, whether it is MGLRU or the
> > > regular LRU, calling shrink_node_memcgs() makes the behavior certain
> > > and reasonable to me.
> >
> > The ordering is uncertain, but ordering has never been specified as
> > part of that interface AFAIK, and you'll still get what you ask for (X
> > bytes from the root or under). Assuming partial reclaim of a cgroup
> > (which I hope is true if you're reclaiming from the root?), if I have
> > the choice I'd rather have the memcg LRU ordering to try to reclaim
> > from colder memcgs first, rather than a static pre-order traversal
> > that always hits the same children first.
> >
> > The reason it's a choice only for the root is because the memcg LRU is
> > maintained at the pgdat level, not at each individual cgroup. So there
> > is no mechanism to get memcg LRU ordering from a subset of cgroups,
> > which would be pretty cool but that sounds expensive.
>
> Got it, thanks for clarifying. From the perspective of memcg, it
> behaves differently. But if we change the perspective to global
> reclaim, it is reasonable, because reclaiming the root memcg is
> another way of doing global reclaim. It makes global reclaim
> consistent. NACK myself :)

Yeah, that's another way to look at it. :)

> >
> > - T.J.
> >
> > > Help you, Help me,
> > > Hailong.
> --
> Help you, Help me,
> Hailong.
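
For context, the proactive reclaim operation discussed in this thread is triggered by writing a size to the root cgroup's memory.reclaim file (cgroup v2). Below is a minimal userspace sketch; it assumes cgroup v2 is mounted at /sys/fs/cgroup and uses an arbitrary 64M request size for illustration.

/*
 * Minimal sketch: ask the kernel to proactively reclaim memory from the
 * root memcg via the cgroup v2 memory.reclaim interface.
 *
 * Assumptions: cgroup v2 mounted at /sys/fs/cgroup, sufficient privilege
 * to write to the root cgroup's files, arbitrary 64M request size.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	const char *path = "/sys/fs/cgroup/memory.reclaim";
	const char *request = "64M";	/* amount of memory to try to reclaim */
	int fd = open(path, O_WRONLY);

	if (fd < 0) {
		perror("open memory.reclaim");
		return 1;
	}

	/*
	 * The write succeeds only if the kernel reclaimed the requested
	 * amount; otherwise it fails (e.g. with EAGAIN).
	 */
	if (write(fd, request, strlen(request)) < 0)
		perror("write memory.reclaim");
	else
		printf("reclaimed %s from the root cgroup\n", request);

	close(fd);
	return 0;
}

With the mem_cgroup_is_root() check kept in root_reclaim(), this write follows the global-reclaim path (the per-pgdat memcg LRU under MGLRU); the patch above would instead have routed it through the cgroup-reclaim path, which is what the thread debates and its author ultimately NACKed.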