On Fri 10-09-21 15:39:28, Vasily Averin wrote: > The kernel currently allows dying tasks to exceed the memcg limits. > The allocation is expected to be the last one and the occupied memory > will be freed soon. > This is not always true because it can be part of the huge vmalloc > allocation. Allowed once, they will repeat over and over again. > Moreover lifetime of the allocated object can differ from > In addition the lifetime of the dying task. > Multiple such allocations running concurrently can not only overuse > the memcg limit, but can lead to a global out of memory and, > in the worst case, cause the host to panic. > > Signed-off-by: Vasily Averin <vvs@xxxxxxxxxxxxx> > --- > mm/memcontrol.c | 23 +++++------------------ > 1 file changed, 5 insertions(+), 18 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 389b5766e74f..67195fcfbddf 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -1834,6 +1834,9 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int > return OOM_ASYNC; > } > > + if (should_force_charge()) > + return OOM_SKIPPED; mem_cgroup_out_of_memory already check for the bypass, now you are duplicating that check with a different answer to the caller. This is really messy. One of the two has to go away. -- Michal Hocko SUSE Labs