On Wed 25-03-15 02:17:10, Johannes Weiner wrote:
> The zonelist locking and the oom_sem are two overlapping locks that
> are used to serialize global OOM killing against different things.
>
> The historical zonelist locking serializes OOM kills from allocations
> with overlapping zonelists against each other to prevent killing more
> tasks than necessary in the same memory domain. Only when neither
> tasklists nor zonelists from two concurrent OOM kills overlap (tasks
> in separate memcgs bound to separate nodes) are OOM kills allowed to
> execute in parallel.
>
> The younger oom_sem is a read-write lock to serialize OOM killing
> against the PM code trying to disable the OOM killer altogether.
>
> However, the OOM killer is a fairly cold error path, there is really
> no reason to optimize for highly performant and concurrent OOM kills.
> And the oom_sem is just flat-out redundant.
>
> Replace both locking schemes with a single global mutex serializing
> OOM kills regardless of context.

OK, this is much simpler.

You have missed drivers/tty/sysrq.c which should take the lock as well.
ZONE_OOM_LOCKED can be removed as well. __out_of_memory in the kerneldoc
should be renamed.

[...]
> @@ -795,27 +728,21 @@ bool out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
>   */
>  void pagefault_out_of_memory(void)
>  {
> -	struct zonelist *zonelist;
> -
> -	down_read(&oom_sem);
>  	if (mem_cgroup_oom_synchronize(true))
> -		goto unlock;
> +		return;

OK, so we are back to what David has asked previously. We do not need
the lock for memcg and oom_killer_disabled because we know that no tasks
(except for potential oom victim) are lurking around at the time
oom_killer_disable() is called. So I guess we want to stick a comment
into mem_cgroup_oom_synchronize before we check for oom_killer_disabled.
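
To illustrate the "single global mutex" idea from the quoted changelog,
and why a caller such as the sysrq handler has to take the same lock,
here is a rough userspace sketch. It is not the patch: a pthread mutex
stands in for the kernel mutex, and every name in it (oom_lock_sketch,
oom_entry_sketch, select_and_kill_sketch, run_concurrent_kills) is made
up for the example.

```c
/* Userspace sketch only: a pthread mutex stands in for a kernel mutex.
 * All identifiers are hypothetical and do not come from the patch. */
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t oom_lock_sketch = PTHREAD_MUTEX_INITIALIZER;
static int in_flight;     /* callers currently inside the kill path */
static int max_in_flight; /* highest concurrency ever observed */
static int kills_done;    /* total completed kill attempts */

/* Stand-in for victim selection and kill; must run under the lock. */
static void select_and_kill_sketch(void)
{
	in_flight++;
	if (in_flight > max_in_flight)
		max_in_flight = in_flight;
	assert(in_flight == 1); /* the global lock admits one caller */
	kills_done++;
	in_flight--;
}

/* Every entry point (allocator, page fault, sysrq) funnels through
 * here, so none of them can bypass the serialization. */
static void *oom_entry_sketch(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&oom_lock_sketch);
	select_and_kill_sketch();
	pthread_mutex_unlock(&oom_lock_sketch);
	return NULL;
}

/* Start up to nthreads concurrent callers (capped at 64) and return
 * the peak number seen inside the kill path at the same time. */
static int run_concurrent_kills(int nthreads)
{
	pthread_t tids[64];
	int started = 0;
	int i;

	if (nthreads > 64)
		nthreads = 64;
	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[started], NULL,
				   oom_entry_sketch, NULL) != 0)
			break; /* join only the threads that did start */
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	return max_in_flight;
}
```

Calling run_concurrent_kills(8) should report a peak concurrency of 1
no matter how the threads are scheduled. An entry point that called
select_and_kill_sketch() without taking the lock would break that
guarantee, which is the concern with a caller that skips the lock.
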

After those are fixed, feel free to add
Acked-by: Michal Hocko <mhocko@xxxxxxx>

-- 
Michal Hocko
SUSE Labs