On Fri 22-09-23 07:05:28, Haifeng Xu wrote: > When application in userland receives oom notification from kernel > and reads the oom_control file, it's confusing that under_oom is 0 > though the omm killer hasn't finished. The reason is that under_oom > is cleared before invoking mem_cgroup_out_of_memory(), so move the > action that unmark under_oom after completing oom handling. Therefore, > the value of under_oom won't mislead users. I do not really remember why are we doing it this way but trying to track this down shows that we have been doing that since fb2a6fc56be6 ("mm: memcg: rework and document OOM waiting and wakeup"). So this is an established behavior for 10 years now. Do we really need to change it now? The interface is legacy and hopefully no new workloads are emerging. I agree that the placement is surprising but I would rather not change that unless there is a very good reason for that. Do you have any actual workload which depends on the ordering? And if yes, how do you deal with timing when the consumer of the notification just gets woken up after mem_cgroup_out_of_memory completes? > Signed-off-by: Haifeng Xu <haifeng.xu@xxxxxxxxxx> > --- > mm/memcontrol.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index e8ca4bdcb03c..0b6ed63504ca 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -1970,8 +1970,8 @@ static bool mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order) > if (locked) > mem_cgroup_oom_notify(memcg); > > - mem_cgroup_unmark_under_oom(memcg); > ret = mem_cgroup_out_of_memory(memcg, mask, order); > + mem_cgroup_unmark_under_oom(memcg); > > if (locked) > mem_cgroup_oom_unlock(memcg); > -- > 2.25.1 -- Michal Hocko SUSE Labs