On 2023/9/25 15:57, Michal Hocko wrote:
> On Fri 22-09-23 07:05:28, Haifeng Xu wrote:
>> When an application in userland receives an oom notification from the
>> kernel and reads the oom_control file, it is confusing that under_oom
>> is 0 even though the oom killer hasn't finished. The reason is that
>> under_oom is cleared before invoking mem_cgroup_out_of_memory(), so
>> move the unmarking of under_oom to after oom handling completes.
>> Therefore, the value of under_oom won't mislead users.
>
> I do not really remember why we are doing it this way but trying to track
> this down shows that we have been doing that since fb2a6fc56be6 ("mm:
> memcg: rework and document OOM waiting and wakeup"). So this is an
> established behavior for 10 years now. Do we really need to change it
> now? The interface is legacy and hopefully no new workloads are
> emerging.
>
> I agree that the placement is surprising but I would rather not change
> that unless there is a very good reason for that. Do you have any actual
> workload which depends on the ordering? And if yes, how do you deal with
> timing when the consumer of the notification just gets woken up after
> mem_cgroup_out_of_memory completes?

Yes. When the oom event is triggered, we check under_oom every 10
seconds. Once it is cleared, we create a new process with a smaller
memory allocation to avoid triggering oom again.

>
>> Signed-off-by: Haifeng Xu <haifeng.xu@xxxxxxxxxx>
>> ---
>>  mm/memcontrol.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index e8ca4bdcb03c..0b6ed63504ca 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -1970,8 +1970,8 @@ static bool mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
>>  	if (locked)
>>  		mem_cgroup_oom_notify(memcg);
>>
>> -	mem_cgroup_unmark_under_oom(memcg);
>>  	ret = mem_cgroup_out_of_memory(memcg, mask, order);
>> +	mem_cgroup_unmark_under_oom(memcg);
>>
>>  	if (locked)
>>  		mem_cgroup_oom_unlock(memcg);
>> --
>> 2.25.1
>
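
For reference, a minimal sketch of the userspace side described in the
reply above: register for the cgroup v1 oom notification via eventfd and
cgroup.event_control, wait for the event, then poll under_oom in
memory.oom_control before restarting the workload. This is not the
poster's actual tooling; the cgroup path "mygroup", the read_under_oom()
helper, and the 10-second interval are illustrative assumptions.

/*
 * Sketch: wait for a memcg oom notification, then poll under_oom.
 * Assumes a cgroup v1 memory controller mounted at the usual place
 * and a hypothetical group "mygroup".
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define MEMCG "/sys/fs/cgroup/memory/mygroup"	/* hypothetical group */

/* Parse the "under_oom 0|1" line from memory.oom_control. */
static int read_under_oom(void)
{
	char buf[256], *p;
	ssize_t n;
	int fd;

	fd = open(MEMCG "/memory.oom_control", O_RDONLY);
	if (fd < 0)
		return -1;
	n = read(fd, buf, sizeof(buf) - 1);
	close(fd);
	if (n <= 0)
		return -1;
	buf[n] = '\0';
	p = strstr(buf, "under_oom ");
	return p ? atoi(p + strlen("under_oom ")) : -1;
}

int main(void)
{
	char cmd[64];
	uint64_t val;
	int efd = eventfd(0, 0);
	int ofd = open(MEMCG "/memory.oom_control", O_RDONLY);
	int cfd = open(MEMCG "/cgroup.event_control", O_WRONLY);

	if (efd < 0 || ofd < 0 || cfd < 0)
		return 1;

	/* "<event_fd> <oom_control_fd>" registers the oom notification */
	snprintf(cmd, sizeof(cmd), "%d %d", efd, ofd);
	if (write(cfd, cmd, strlen(cmd)) < 0)
		return 1;

	/* blocks until the memcg hits oom */
	if (read(efd, &val, sizeof(val)) != sizeof(val))
		return 1;

	/* poll until the kernel clears under_oom, then restart workload */
	while (read_under_oom() == 1)
		sleep(10);

	printf("under_oom cleared, restarting with a smaller footprint\n");
	return 0;
}

With the current ordering in mem_cgroup_oom(), the poll loop above can
observe under_oom == 0 while mem_cgroup_out_of_memory() is still running,
which is the behavior the patch is trying to address.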