Re: [PATCH 1/2] memcg, oom: unmark under_oom after the oom killer is done

Michal Hocko <mhocko@xxxxxxxx> · Mon, 25 Sep 2023 09:57:48 +0200

On Fri 22-09-23 07:05:28, Haifeng Xu wrote:
> When application in userland receives oom notification from kernel
> and reads the oom_control file, it's confusing that under_oom is 0
> though the omm killer hasn't finished. The reason is that under_oom
> is cleared before invoking mem_cgroup_out_of_memory(), so move the
> action that unmark under_oom after completing oom handling. Therefore,
> the value of under_oom won't mislead users.

I do not really remember why are we doing it this way but trying to track
this down shows that we have been doing that since fb2a6fc56be6 ("mm:
memcg: rework and document OOM waiting and wakeup"). So this is an
established behavior for 10 years now. Do we really need to change it
now? The interface is legacy and hopefully no new workloads are
emerging.

I agree that the placement is surprising but I would rather not change
that unless there is a very good reason for that. Do you have any actual
workload which depends on the ordering? And if yes, how do you deal with
timing when the consumer of the notification just gets woken up after
mem_cgroup_out_of_memory completes?

> Signed-off-by: Haifeng Xu <haifeng.xu@xxxxxxxxxx>
> ---
>  mm/memcontrol.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index e8ca4bdcb03c..0b6ed63504ca 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1970,8 +1970,8 @@ static bool mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
>  	if (locked)
>  		mem_cgroup_oom_notify(memcg);
>  
> -	mem_cgroup_unmark_under_oom(memcg);
>  	ret = mem_cgroup_out_of_memory(memcg, mask, order);
> +	mem_cgroup_unmark_under_oom(memcg);
>  
>  	if (locked)
>  		mem_cgroup_oom_unlock(memcg);
> -- 
> 2.25.1

-- 
Michal Hocko
SUSE Labs