On Fri 31-01-14 12:36:20, Andrew Morton wrote:
> Subject: [obsolete] memcg-do-not-hang-on-oom-when-killed-by-userspace-oom-access-to-memory-reserves.patch removed from -mm tree
> To: mhocko@xxxxxxx,ebiederm@xxxxxxxxxxxx,hannes@xxxxxxxxxxx,kamezawa.hiroyu@xxxxxxxxxxxxxx,rientjes@xxxxxxxxxx,stable@xxxxxxxxxxxxxxx,mm-commits@xxxxxxxxxxxxxxx
> From: akpm@xxxxxxxxxxxxxxxxxxxx
> Date: Fri, 31 Jan 2014 12:36:20 -0800
>
>
> The patch titled
>      Subject: memcg: do not hang on OOM when killed by userspace OOM access to memory reserves
> has been removed from the -mm tree.  Its filename was
>      memcg-do-not-hang-on-oom-when-killed-by-userspace-oom-access-to-memory-reserves.patch
>
> This patch was dropped because it is obsolete

What has made the patch obsolete? I do not see any alternative merged in
Linus' tree.

> ------------------------------------------------------
> From: Michal Hocko <mhocko@xxxxxxx>
> Subject: memcg: do not hang on OOM when killed by userspace OOM access to memory reserves
>
> Eric has reported that he can see task(s) stuck in memcg OOM handler
> regularly.  The only way out is to
>
> 	echo 0 > $GROUP/memory.oom_control
>
> His usecase is:
>
> - Setup a hierarchy with memory and the freezer (disable kernel oom and
>   have a process watch for oom).
>
> - In that memory cgroup add a process with one thread per cpu.
>
> - In one thread slowly allocate once per second I think it is 16M of ram
>   and mlock and dirty it (just to force the pages into ram and stay
>   there).
>
> - When oom is achieved loop:
>   * attempt to freeze all of the tasks.
>   * if frozen send every task SIGKILL, unfreeze, remove the directory in
>     cgroupfs.
>
> Eric has then pinpointed the issue to be memcg specific.
>
> All tasks are sitting on the memcg_oom_waitq when memcg oom is disabled.
> Those that have received fatal signal will bypass the charge and should
> continue on their way out.  The tricky part is that the exit path might
> trigger a page fault (e.g. exit_robust_list), thus the memcg charge,
> while its memcg is still under OOM because nobody has released any charges
> yet.
>
> Unlike with the in-kernel OOM handler the exiting task doesn't get
> TIF_MEMDIE set so it doesn't shortcut further charges of the killed task
> and falls to the memcg OOM again without any way out of it as there are no
> fatal signals pending anymore.
>
> This patch fixes the issue by checking PF_EXITING early in
> __mem_cgroup_try_charge and bypass the charge same as if it had fatal
> signal pending or TIF_MEMDIE set.
>
> Normally exiting tasks (aka not killed) will bypass the charge now but
> this should be OK as the task is leaving and will release memory and
> increasing the memory pressure just to release it in a moment seems
> dubious wasting of cycles.  Besides that charges after exit_signals should
> be rare.
>
> Reported-by: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
> Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
> Cc: David Rientjes <rientjes@xxxxxxxxxx>
> Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
>
>  mm/memcontrol.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff -puN mm/memcontrol.c~memcg-do-not-hang-on-oom-when-killed-by-userspace-oom-access-to-memory-reserves mm/memcontrol.c
> --- a/mm/memcontrol.c~memcg-do-not-hang-on-oom-when-killed-by-userspace-oom-access-to-memory-reserves
> +++ a/mm/memcontrol.c
> @@ -2670,7 +2670,8 @@ static int __mem_cgroup_try_charge(struc
>  	 * MEMDIE process.
>  	 */
>  	if (unlikely(test_thread_flag(TIF_MEMDIE)
> -		     || fatal_signal_pending(current)))
> +		     || fatal_signal_pending(current))
> +		     || current->flags & PF_EXITING)
>  		goto bypass;
>
>  	if (unlikely(task_in_memcg_oom(current)))
> _
>
> Patches currently in -mm which might be from mhocko@xxxxxxx are
>
> origin.patch
> mm-vmscan-respect-numa-policy-mask-when-shrinking-slab-on-direct-reclaim.patch
> mm-vmscan-move-call-to-shrink_slab-to-shrink_zones.patch
> mm-vmscan-remove-shrink_control-arg-from-do_try_to_free_pages.patch
>

-- 
Michal Hocko
SUSE Labs
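
For anyone skimming the quoted changelog, the bypass condition it describes
can be modelled in user space roughly as below. This is only a sketch, not
kernel code: struct task_model, MODEL_PF_EXITING, bypass_before() and
bypass_after() are hypothetical names made up for illustration, and the flag
value is arbitrary. It shows the one case the patch changes: a task that has
already consumed its fatal signal, never got TIF_MEMDIE, and faults again on
its way out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for kernel state. The names and the flag value
 * are illustrative assumptions, not the kernel's own definitions.
 */
#define MODEL_PF_EXITING 0x4u

struct task_model {
	bool tif_memdie;           /* models test_thread_flag(TIF_MEMDIE) */
	bool fatal_signal_pending; /* models fatal_signal_pending(current) */
	unsigned int flags;        /* models current->flags */
};

/* Before the patch: only OOM victims and freshly killed tasks bypass. */
static bool bypass_before(const struct task_model *t)
{
	return t->tif_memdie || t->fatal_signal_pending;
}

/* After the patch: a task that is already exiting bypasses as well. */
static bool bypass_after(const struct task_model *t)
{
	return t->tif_memdie || t->fatal_signal_pending ||
	       (t->flags & MODEL_PF_EXITING) != 0;
}

/*
 * The stuck case from the changelog: killed from userspace, so no
 * TIF_MEMDIE; the fatal signal is no longer pending; the exit path
 * triggers another charge.
 */
static void model_self_check(void)
{
	struct task_model exiting = { false, false, MODEL_PF_EXITING };

	assert(!bypass_before(&exiting)); /* would fall into memcg OOM again */
	assert(bypass_after(&exiting));   /* now skips the charge */
}
```

In this model a task with a fatal signal still pending bypasses under both
conditions, so the only behavioural difference is for tasks that are already
in the exit path, which is the trade-off the changelog discusses.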