On Wed, 6 Sep 2017, Michal Hocko wrote:

> I am not sure this is how things evolved actually. This is way before
> my time so my git log interpretation might be imprecise. We do have
> oom_badness heuristic since out_of_memory has been introduced and
> oom_kill_allocating_task has been introduced much later because of large
> boxes with zillions of tasks (SGI I suspect) which took too long to
> select a victim so David has added this heuristic.

Nope. The logic was required for tasks that run out of memory when the
restriction on the allocation did not allow the use of all of memory.
cpuset restrictions and memory policy restrictions were the prime
considerations at the time.

It has *nothing* to do with zillions of tasks. It's amusing that the SGI
ghost is still haunting the discussion here. The company finally died a
couple of years ago (ok, somehow HP has an "SGI" brand now, I believe).
But there are multiple companies that have large NUMA configurations, and
they all have configurations where they want to restrict the allocations
of a process to a subset of system memory. This is even more important
now that we get new forms of memory (NVDIMM, PCI-E device memory, etc.).
You need to figure out what to do with allocations that fail because the
*allowed* memory pools are empty.