On Tue 04-01-22 13:22:25, Yu Zhao wrote:
[...]
> +static void walk_mm(struct lruvec *lruvec, struct mm_struct *mm, struct lru_gen_mm_walk *walk)
> +{
> +	static const struct mm_walk_ops mm_walk_ops = {
> +		.test_walk = should_skip_vma,
> +		.p4d_entry = walk_pud_range,
> +	};
> +
> +	int err;
> +#ifdef CONFIG_MEMCG
> +	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
> +#endif
> +
> +	walk->next_addr = FIRST_USER_ADDRESS;
> +
> +	do {
> +		unsigned long start = walk->next_addr;
> +		unsigned long end = mm->highest_vm_end;
> +
> +		err = -EBUSY;
> +
> +		rcu_read_lock();
> +#ifdef CONFIG_MEMCG
> +		if (memcg && atomic_read(&memcg->moving_account))
> +			goto contended;
> +#endif
> +		if (!mmap_read_trylock(mm))
> +			goto contended;

Have you evaluated the behavior under mmap_sem contention? I mean, what
would be the effect of some mms being excluded from the walk? This path
is called from direct reclaim, and we do allocate with exclusive
mmap_sem held IIRC, and the trylock can fail in the presence of a
pending writer if I am not mistaken, so even a read lock holder (e.g.
an allocation from the #PF path) can bypass the walk.

Or is this considered statistically insignificant and thus a
theoretical problem?

--
Michal Hocko
SUSE Labs
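
To make the contention scenario concrete: mmap_read_trylock() fails as
soon as a writer is queued on the rwsem, so an mm under steady write
contention can end up excluded from every aging pass. The sketch below
is a hypothetical mitigation only, not what the posted patch does;
walk_mm_with_retry(), MAX_WALK_RETRIES and the cond_resched() backoff
are illustrative assumptions.

	#include <linux/mmap_lock.h>
	#include <linux/sched.h>

	/* Illustrative bound, not from the posted patch. */
	#define MAX_WALK_RETRIES	3

	/*
	 * Hypothetical sketch: retry a contended mm a bounded number of
	 * times instead of skipping it outright. Returns true if the
	 * walk actually ran.
	 */
	static bool walk_mm_with_retry(struct lruvec *lruvec,
				       struct mm_struct *mm,
				       struct lru_gen_mm_walk *walk)
	{
		int retries;

		for (retries = 0; retries < MAX_WALK_RETRIES; retries++) {
			if (mmap_read_trylock(mm)) {
				/* ... walk the VMAs as walk_mm() does ... */
				mmap_read_unlock(mm);
				return true;
			}
			/* A queued writer makes the trylock fail; back off. */
			cond_resched();
		}
		/* Still contended: this mm is skipped for this aging pass. */
		return false;
	}

Even with such a retry, a persistently contended mm is still skipped
eventually, so the question of whether that is statistically
significant remains.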