On Tue, 3 Aug 2010, KAMEZAWA Hiroyuki wrote:

> In old behavior, oom_score order is synchronous both in the system and
> container. High-score one will be killed.
> IOW, oom_score have worked as oom_score.
>

This isn't necessarily true, as I've already pointed out: the task with
the highest score exported by /proc/pid/oom_score is not always killed,
because it may not be a candidate task. It may be in a disjoint memcg,
for example. The highest-scoring _candidate_ task is killed, and that's
unchanged with my rewrite.

The current /proc/pid/oom_score is also not synchronous between the
system and container, at least in the cpuset case, since we currently
divide a task's score by 8 if it doesn't intersect current's
mems_allowed. So that's not true either.

> But, after the patch, the user (of LXC at el.) can't trust oom_score.

Yes, they can, but they need to know the context in which the oom occurs.
/proc/pid/oom_score cannot export multiple values, although a task's kill
ranking actually depends on whether it's a system oom, memcg oom, cpuset
oom, etc. It needs to export a single value as a function of the
heuristic. The user must then take those values at the time of collection
and find how the various tasks rank relative to one another depending on
MPOL_BIND, cpuset hierarchy, etc.

That's actually not that difficult: admins who don't use any cgroups
typically only have system-wide ooms, where oom_score is always accurate,
and admins who use cpusets, memcg, or mempolicies on large NUMA systems
already know the set of tasks attached to them and want to prioritize
the killing list specifically for those entities.

> Especially with memcg, it just shows a _broken_ value.
>

Not at all. The user knows what tasks are attached to the memcg and can
easily determine which task is going to be killed when it ooms: simply
iterate through the memcg tasklist, check /proc/pid/oom_score, and sort.
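For illustration only, the iterate/check/sort step could look roughly
like the sketch below. The function name, the proc_root parameter, and
the example tasks path are my own assumptions for the sketch, not part of
any patch; where the memcg hierarchy is mounted depends on the system.
The result is a snapshot taken at the time of collection, nothing more.

```python
import os


def rank_oom_candidates(tasks_file, proc_root="/proc"):
    """Return [(pid, oom_score), ...] for the tasks listed in tasks_file,
    highest score first.

    tasks_file is assumed to be a memcg task list with one pid per line.
    proc_root is a hypothetical parameter so the sketch can be pointed at
    a fake directory tree for testing.
    """
    with open(tasks_file) as f:
        pids = [int(line) for line in f if line.strip()]

    ranked = []
    for pid in pids:
        path = os.path.join(proc_root, str(pid), "oom_score")
        try:
            with open(path) as f:
                score = int(f.read().strip())
        except (OSError, ValueError):
            # The task exited between the two reads, or the file could
            # not be read or parsed; skip it.
            continue
        ranked.append((pid, score))

    # Highest score first; ties are ordered by pid so output is stable.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


if __name__ == "__main__":
    # Example path only (an assumption); adjust to the local mount point.
    for pid, score in rank_oom_candidates("/cgroup/memory/mygroup/tasks"):
        print(pid, score)
```

The first entry is the task that ranks highest within that memcg at the
moment the values were read; scores change as memory usage changes, so
the list needs to be collected again whenever a current ranking is
wanted.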