On Mon, 2 Dec 2013, Michal Hocko wrote:

> I guess we need to know how much is significantly less.
> oom_scan_process_thread already aborts on exiting tasks so we do not
> kill anything and then the charge (whole page fault actually) is retried
> when we check for the OOM again so my intuition would say that we gave
> the exiting task quite a lot of time.
>

That isn't the race, though. The race occurs when the oom killed process
exits prior to the process iteration, so it's not detected, and yet its
memory has already been freed and the memcg is no longer oom. In other
words, a process has called mem_cgroup_oom_synchronize() at the same time
that an oom killed process has freed its memory. The result is an
unnecessary oom kill and erroneous spam in the kernel log.

We all agree that this race cannot be completely closed (at least not
without synchronization in the uncharge path, which we obviously don't
want to add). We don't know if an oom killed process, or any process,
will free its memory immediately after the kernel sends the SIGKILL.
However, there's absolutely no reason not to have a final check
immediately before sending the SIGKILL to prevent that unnecessary oom
kill.

I'm going to send the patch for review.
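To make the idea of a final check concrete, here is a minimal userspace sketch in C. It is not the patch and not kernel code: struct toy_memcg, toy_memcg_is_oom(), toy_memcg_uncharge() and toy_oom_kill() are hypothetical names invented for illustration, and the "kill" is only a counter. It shows the ordering only: recheck whether the group is still over its limit right before the point where the SIGKILL would be sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg; NOT the kernel's struct mem_cgroup. */
struct toy_memcg {
	size_t usage;	/* bytes currently charged */
	size_t limit;	/* hard limit in bytes */
	int kills;	/* number of simulated SIGKILLs sent */
};

/* True when charging nr_bytes more would not fit under the limit. */
static bool toy_memcg_is_oom(const struct toy_memcg *cg, size_t nr_bytes)
{
	assert(cg != NULL);
	/* Written this way to avoid overflow in usage + nr_bytes. */
	return cg->usage > cg->limit || nr_bytes > cg->limit - cg->usage;
}

/* Simulates an exiting task freeing (uncharging) its memory. */
static void toy_memcg_uncharge(struct toy_memcg *cg, size_t nr_bytes)
{
	assert(cg != NULL);
	cg->usage -= nr_bytes > cg->usage ? cg->usage : nr_bytes;
}

/*
 * Simulated oom handler for a charge of nr_bytes.
 * Returns true if a victim was "killed", false if the kill was skipped
 * because the group was no longer oom at the final check.
 */
static bool toy_oom_kill(struct toy_memcg *cg, size_t nr_bytes)
{
	assert(cg != NULL);

	/* Victim selection (the process iteration) would happen here. */

	/* Final check immediately before the kill would be sent. */
	if (!toy_memcg_is_oom(cg, nr_bytes))
		return false;

	cg->kills++;	/* stands in for sending SIGKILL */
	return true;
}
```

If another task uncharges memory between the failed charge and the call to toy_oom_kill(), the function returns false and nothing is killed. The window between the final check and the kill itself remains, which matches the point above that the race can be narrowed but not closed without synchronization in the uncharge path.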