On Sat 01-06-13 02:11:51, Johannes Weiner wrote:
[...]
> I'm currently messing around with the below patch.  When a task faults
> and charges under OOM, the memcg is remembered in the task struct and
> then made to sleep on the memcg's OOM waitqueue only after unwinding
> the page fault stack.  With the kernel OOM killer disabled, all tasks
> in the OOMing group sit nicely in
>
>   mem_cgroup_oom_synchronize
>   pagefault_out_of_memory
>   mm_fault_error
>   __do_page_fault
>   page_fault
>   0xffffffffffffffff
>
> regardless of whether they were faulting anon or file.  They do not
> even hold the mmap_sem anymore at this point.
>
> [ I kept syscalls really simple for now and just have them return
>   -ENOMEM, never trap them at all (just like the global OOM case).
>   It would be more work to have them wait from a flatter stack too,
>   but it should be doable if necessary. ]
>
> I suggested this at the MM summit and people were essentially asking
> if I was feeling well, so maybe I'm still missing a gaping hole in
> this idea.

I didn't get to look at the patch (will do on Monday) but it doesn't
sound entirely crazy. Well, we would have to drop mmap_sem so things
have to be rechecked, but we are doing that already with VM_FAULT_RETRY
in some archs so it should work.

> Patch only works on x86 as of now, on other architectures memcg OOM
> will invoke the global OOM killer.
[...]
-- 
Michal Hocko
SUSE Labs