Re: [patch] mm, memcg: add oom killer delay

Michal Hocko <mhocko@xxxxxxx> · Sat, 1 Jun 2013 12:29:05 +0200

On Sat 01-06-13 02:11:51, Johannes Weiner wrote:
[...]
> I'm currently messing around with the below patch.  When a task faults
> and charges under OOM, the memcg is remembered in the task struct and
> then made to sleep on the memcg's OOM waitqueue only after unwinding
> the page fault stack.  With the kernel OOM killer disabled, all tasks
> in the OOMing group sit nicely in
> 
>   mem_cgroup_oom_synchronize
>   pagefault_out_of_memory
>   mm_fault_error
>   __do_page_fault
>   page_fault
>   0xffffffffffffffff
> 
> regardless of whether they were faulting anon or file.  They do not
> even hold the mmap_sem anymore at this point.
> 
> [ I kept syscalls really simple for now and just have them return
>   -ENOMEM, never trap them at all (just like the global OOM case).
>   It would be more work to have them wait from a flatter stack too,
>   but it should be doable if necessary. ]
> 
> I suggested this at the MM summit and people were essentially asking
> if I was feeling well, so maybe I'm still missing a gaping hole in
> this idea.

I didn't get to look at the patch (will do on Monday) but it doesn't
sound entirely crazy. Well, we would have to drop mmap_sem, so things
have to be rechecked afterwards, but we are already doing that with
VM_FAULT_RETRY on some archs, so it should work.
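
A rough userspace sketch of the flow described in the quoted text may
help here: the charge path only remembers the memcg in the task and
fails, and the task is parked on the OOM waitqueue only once the fault
stack has unwound and mmap_sem is dropped. Every name below (task_sim,
memcg_sim, charge_sim, handle_fault_sim, oom_synchronize_sim) is a
hypothetical stand-in, not the patch and not kernel API, and the
waitqueue sleep is modelled as a plain counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real kernel structures. */
struct memcg_sim {
    int waiters;                 /* tasks parked on the group's OOM waitqueue */
};

struct task_sim {
    bool holds_mmap_sem;         /* models the fault path holding mmap_sem */
    struct memcg_sim *oom_memcg; /* group remembered while charging under OOM */
};

enum fault_result { FAULT_OK, FAULT_OOM };

/* Charge path: on OOM, record the group and fail. It never sleeps here. */
enum fault_result charge_sim(struct task_sim *t, struct memcg_sim *cg,
                             bool over_limit)
{
    if (!over_limit)
        return FAULT_OK;
    t->oom_memcg = cg;
    return FAULT_OOM;
}

/* Fault handler: takes the lock, charges, and always unwinds before
 * returning, so no OOM waiting happens with mmap_sem held. */
enum fault_result handle_fault_sim(struct task_sim *t, struct memcg_sim *cg,
                                   bool over_limit)
{
    enum fault_result r;

    t->holds_mmap_sem = true;
    r = charge_sim(t, cg, over_limit);
    t->holds_mmap_sem = false;   /* stack unwound, lock dropped */
    return r;
}

/* Called from the flat end of the fault path. Returns true if the task
 * had a remembered memcg and was parked on its waitqueue. */
bool oom_synchronize_sim(struct task_sim *t)
{
    struct memcg_sim *cg = t->oom_memcg;

    if (!cg)
        return false;
    assert(!t->holds_mmap_sem);  /* the point of the scheme */
    cg->waiters++;               /* real code would sleep here instead */
    t->oom_memcg = NULL;
    return true;
}
```

After the task is woken it would simply return to userspace and
re-fault, which is where the rechecking happens, much like the
VM_FAULT_RETRY case.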

> Patch only works on x86 as of now, on other architectures memcg OOM
> will invoke the global OOM killer.
[...]
-- 
Michal Hocko
SUSE Labs