Re: oom killer rewrite

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 20 May 2010, KAMEZAWA Hiroyuki wrote:

> I've pointed out that "normalized" parameter doesn't seem to work well in some
> situaion (in cluster). I hope you'll have an extra interface as
> 
> 	echo 3G > /proc/<pid>/oom_indemification
> 
> to allow users have "absolute value" setting.
> (If the admin know usual memory usage of an application, we can only
>  add badness to extra memory usage.)
> 
> To be honest, I can't fully understand why we need _normalized_ parameter. Why
> oom_adj _which is now used_ is not enough for setting "relative importance" ?
> 

The only sane badness heuristic will be one that effectively compares all 
eligible tasks for oom kill in a way that are relative to one another; I'm 
concerned that a tunable that is based on a pure memory quantity requires 
specific knowledge of the system (or memcg, cpuset, etc) capacity before 
it is meaningful.  In other words, I opted to use a relative proportion so 
that when tasks are constrained to cpusets or memcgs or mempolicies they 
become part of a "virtualized system" where the proportion is then used in 
calculation of the total amount of system RAM, memcg limit, cpuset mems 
capacities, etc, without knowledge of what that value actually is.  So 
"echo 3G" may be valid in your example when not constrained to any cgroup 
or mempolicy but becomes invalid if I attach it to a cpuset with a single 
node of 1G capacity.  When oom_score_adj, we can specify the proportion 
"of the resources that the application has access to" in comparison to 
other applications that share those resources to determine oom killing 
priority.  I think that's a very powerful interface and your suggestion 
could easily be implemented in userspace with a simple divide, thus we 
don't need kernel support for it.

> And, IIRC, Nick pointed out that "don't remove _used_ interfaces just because
> you hate it or it seems not clean". So, I recommend you to drop sysctl changes.
> 

I addressed this in 
oom-reintroduce-and-deprecate-oom_kill_allocating_task.patch so that the 
end result was that only the oom_dump_tasks sysctl was removed because it 
was now enabled by default since it provides useful information about the 
state of the VM at the time of allocation failure.  So there's no longer 
any removal of "used" interfaces (the only previous use of oom_dump_tasks 
was to enable it, which is now the default).  Are you disagreeing with the 
deprecation of the sysctl since it was then folded into the new 
oom_dump_quick sysctl?

Regardless, I dropped all of these cleanups from my latest patch series 
which I'll post shortly, but keep in mind that what existed in -mm before 
it was dropped did not break any userspace API, it simply deprecated them 
for removal in two years.

> I think the whole concept of your patch series is good and I like it.

Thanks, Kame, for your continued interest in this work, it's encouraging.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]