Re: [PATCH memcg] memcg: prohibit unconditional exceeding the limit of dying tasks

Michal Hocko <mhocko@xxxxxxxx> · Mon, 13 Sep 2021 10:53:35 +0200

On Fri 10-09-21 15:39:28, Vasily Averin wrote:
> The kernel currently allows dying tasks to exceed the memcg limits.
> The allocation is expected to be the last one and the occupied memory
> will be freed soon.
> This is not always true because it can be part of the huge vmalloc
> allocation. Allowed once, they will repeat over and over again.
> Moreover lifetime of the allocated object can differ from
> In addition the lifetime of the dying task.
> Multiple such allocations running concurrently can not only overuse
> the memcg limit, but can lead to a global out of memory and,
> in the worst case, cause the host to panic.
> 
> Signed-off-by: Vasily Averin <vvs@xxxxxxxxxxxxx>
> ---
>  mm/memcontrol.c | 23 +++++------------------
>  1 file changed, 5 insertions(+), 18 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 389b5766e74f..67195fcfbddf 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1834,6 +1834,9 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
>  		return OOM_ASYNC;
>  	}
>  
> +	if (should_force_charge())
> +		return OOM_SKIPPED;

mem_cgroup_out_of_memory already check for the bypass, now you are
duplicating that check with a different answer to the caller. This is
really messy. One of the two has to go away.

-- 
Michal Hocko
SUSE Labs