Re: [patch -mm 1/2] oom: badness heuristic rewrite

On Tue, 3 Aug 2010, KAMEZAWA Hiroyuki wrote:

> In old behavior, oom_score order is synchronous both in the system and
> container. High-score one will be killed.
> IOW, oom_score has worked as oom_score.
> 

This isn't necessarily true, as I've already pointed out: the task with the
highest score exported by /proc/pid/oom_score is not killed if it's not a
candidate task; it may be in a disjoint memcg, for example.  The highest
_candidate_ task is killed, and that's unchanged with my rewrite.

The current /proc/pid/oom_score is also not synchronous between the system
and container, at least in the cpuset case, since we currently divide a
task's score by 8 if it doesn't intersect current's mems_allowed, so
that's not true either.
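
To make the cpuset case concrete, here is a rough userspace model of that
adjustment.  It is only an illustration of the arithmetic, not kernel code:
the function name and the use of Python sets for the allowed memory nodes
are assumptions made for the example.

```python
def legacy_cpuset_adjust(points, task_mems, current_mems):
    """Model the current behavior described above (illustrative only).

    points       -- the task's score before the cpuset adjustment
    task_mems    -- memory nodes the task may allocate from (hypothetical input)
    current_mems -- memory nodes of the task that triggered the oom
    """
    # If the two node sets share no node, the score is divided by 8.
    if set(task_mems).isdisjoint(current_mems):
        return points // 8
    return points


# The same task gets a different effective score depending on who ooms.
same_cpuset = legacy_cpuset_adjust(800, {0, 1}, {1})
other_cpuset = legacy_cpuset_adjust(800, {0}, {1})
```

So a single exported number already cannot describe both contexts at once.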

> But, after the patch, the user (of LXC et al.) can't trust oom_score.

Yes, they can, but they need to know the context in which the oom occurs.
/proc/pid/oom_score cannot export multiple values, although a task's kill
ranking actually depends on whether it's a system oom, memcg oom, cpuset
oom, etc.  It needs to export a single value as a function of the
heuristic.  The user must then take those values at the time of
collection and find how the various tasks rank relative to one another
depending on MPOL_BIND, cpuset hierarchy, etc.  That's actually not that
difficult: admins who don't use any cgroups typically only have
system-wide ooms, where oom_score is always accurate, and admins who use
cpusets, memcg, or mempolicies on large NUMA systems already know the set
of tasks attached to them and want to prioritize the kill list
specifically for those entities.

> Especially with memcg, it just shows a _broken_ value.
> 

Not at all, the user knows what tasks are attached to the memcg and can 
easily determine which task is going to be killed when it ooms: simply 
iterate through the memcg tasklist, check /proc/pid/oom_score, and sort.
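
A minimal sketch of that iteration, for illustration only.  The function
name is made up, and the path to the memcg's task list is an assumption
that depends on where the cgroup filesystem is mounted on your system, so
it is passed in rather than hardcoded.

```python
import os


def rank_memcg_tasks(tasks_file, proc_root="/proc"):
    """Return [(pid, oom_score), ...] for the ids in tasks_file, highest first.

    tasks_file -- the memcg's task list, one id per line (path is an assumption)
    proc_root  -- where procfs is mounted; overridable for testing
    """
    with open(tasks_file) as f:
        pids = [int(line) for line in f if line.strip()]

    ranked = []
    for pid in pids:
        score_path = os.path.join(proc_root, str(pid), "oom_score")
        try:
            with open(score_path) as f:
                ranked.append((pid, int(f.read().strip())))
        except (OSError, ValueError):
            # The task may have exited since the list was read; skip it.
            continue

    # Highest score first: the head of the list is the expected victim
    # for an oom confined to this memcg.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

The result is a snapshot: scores change as tasks allocate and free memory,
so the ranking is only valid for the time of collection.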
