On Tue, 8 Jun 2010, Andrew Morton wrote:

> > Tasks that do not share the same set of allowed nodes with the task that
> > triggered the oom should not be considered as candidates for oom kill.
> >
> > Tasks in other cpusets with a disjoint set of mems would be unfairly
> > penalized otherwise because of oom conditions elsewhere; an extreme
> > example could unfairly kill all other applications on the system if a
> > single task in a user's cpuset sets itself to OOM_DISABLE and then uses
> > more memory than allowed.
>
> OK, so Nick's change didn't anticipate things being set to OOM_DISABLE?
>

I wrote out a more elaborate rebuttal to this in your reply to my latest
patchset, but there are three problems with not strictly eliminating
these tasks from consideration.  It unfairly penalizes tasks in other
cpusets simply because they're big.  There's no way to account for the
scale of other cpusets compared to current's with a single divide in the
heuristic (in this case, divide by 8).  And there's no guarantee that
killing such a task would free any memory, which has two results: (i) we
need to reinvoke the oom killer to kill yet another task, and (ii) we've
now unnecessarily killed a task simply because it was large and probably
lost a substantial amount of work.

> OOM_DISABLE seems pretty dangerous really - allows malicious
> unprivileged users to go homicidal?
>

OOM_DISABLE doesn't get set without CAP_SYS_RESOURCE; you need that
capability to decrease an oom_adj value.  So my changelog could probably
benefit from s/user/job/.
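
To illustrate the difference between the divide-by-8 penalty and strict
exclusion, here is a minimal userspace sketch.  It is not the kernel
code: struct task_model, score_divide(), score_exclude() and
select_victim() are hypothetical names made up for this example, the
allowed nodes are modelled as a plain bitmask, and rss_pages stands in
for the real badness score.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct task_model {
	const char *name;
	uint64_t mems_allowed;   /* bitmask of allowed memory nodes */
	unsigned long rss_pages; /* stand-in for the real badness score */
	bool oom_disable;        /* models oom_adj == OOM_DISABLE */
};

/* True if the two tasks share at least one allowed memory node. */
static bool mems_intersect(const struct task_model *a,
			   const struct task_model *b)
{
	return (a->mems_allowed & b->mems_allowed) != 0;
}

/* Penalty variant: a disjoint task stays a candidate at 1/8 of its score. */
static unsigned long score_divide(const struct task_model *t,
				  const struct task_model *cur)
{
	unsigned long points;

	if (t->oom_disable)
		return 0;
	points = t->rss_pages;
	if (!mems_intersect(t, cur))
		points /= 8;
	return points;
}

/* Exclusion variant: a disjoint task is never a candidate. */
static unsigned long score_exclude(const struct task_model *t,
				   const struct task_model *cur)
{
	if (t->oom_disable || !mems_intersect(t, cur))
		return 0;
	return t->rss_pages;
}

/* Return the highest-scoring task, or NULL if no task is eligible. */
static const struct task_model *
select_victim(const struct task_model *tasks, size_t n,
	      const struct task_model *cur,
	      unsigned long (*score)(const struct task_model *,
				     const struct task_model *))
{
	const struct task_model *victim = NULL;
	unsigned long best = 0;
	size_t i;

	assert(tasks != NULL || n == 0);
	for (i = 0; i < n; i++) {
		unsigned long points = score(&tasks[i], cur);

		if (points > best) {
			best = points;
			victim = &tasks[i];
		}
	}
	return victim;
}
```

Take current confined to node 0 and set to OOM_DISABLE, one task of
80000 pages confined to node 1, and one task of 2000 pages on node 0.
score_divide() selects the node 1 task (80000 / 8 = 10000 > 2000) even
though killing it frees nothing on node 0, while score_exclude() selects
the node 0 task.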