Re: [patch -mm 1/2] oom: badness heuristic rewrite

On Tue, 3 Aug 2010 10:08:15 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:

> On Mon, 2 Aug 2010 18:02:48 -0700 (PDT)
> David Rientjes <rientjes@xxxxxxxxxx> wrote:
> 
> > On Tue, 3 Aug 2010, KAMEZAWA Hiroyuki wrote:
> > 
> > > > > Then an application's oom_score on one host is quite different from its
> > > > > score on another host. This is a very new behavior rather than a simple
> > > > > interface update. This opinion was rejected.
> > > > > 
> > > > 
> > > > It wasn't rejected, I responded to your comment and you never wrote back.  
> > > > The idea 
> > > > 
> > > I just got tired of writing the same thing many times, and I don't have
> > > a strong opinion. I _know_ your patch fixes the X-server problem. That was
> > > enough for me.
> > > 
> > 
> > There are a couple of reasons why I disagree that oom_score_adj should have 
> > memory quantity units.
> > 
> > First, individual oom scores that come out of oom_badness() don't mean 
> > anything in isolation, they only mean something when compared to other 
> > candidate tasks.  All applications, whether attached to a cpuset, a 
> > mempolicy, a memcg, or not, have an allowed set of memory and applications 
> > that are competing for those shared resources.  When defining what 
> > application happens to be the most memory hogging, which is the one we 
> > want to kill, they are ranked amongst themselves.  Using oom_score_adj as 
> > a proportion, we can say a particular application should be allowed 25% of 
> > resources, other applications should be allowed 5%, and others should be 
> > penalized 10%, for example.  This makes prioritization for oom kill rather 
> > simple.
> > 
> > Second, we don't want to adjust oom_score_adj anytime a task is attached 
> > to a cpuset, a mempolicy, or a memcg, or whenever that cpuset's mems 
> > change, the bound mempolicy nodemask changes, or the memcg limit changes.  
> > The application need not know what that set of allowed memory is and the 
> > kernel should operate seamlessly regardless of what the attachment is.  
> > These are, in a sense, "virtualized" systems unto themselves: if a task is 
> > moved from a child cpuset to the root cpuset, its set of allowed memory 
> > may become much larger.  That action shouldn't need to have an equivalent 
> > change to /proc/pid/oom_score_adj: the priority of the task relative to 
> > its other competing tasks is the same.  That set of allowed memory may 
> > change, but its priority does not unless explicitly changed by the admin.
> > 
> 
> Hmm, then does oom_score show the values for all limitations as an array?
> 
Anyway, the fact that oom_score can change depending on the context of the
OOM may confuse admins. "OMG, why was a low oom_score application killed! Shit!"
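
To make that concern concrete, here is a minimal Python sketch of the
proportional scoring discussed above. It is a simplified model, not the kernel
code: it assumes the score is the task's memory usage as a per-mille share of
the memory it is allowed to use, plus oom_score_adj, clamped to 0..1000, and
it leaves out everything else oom_badness() may consider. The function name
badness() and all sizes are made up for illustration.

```python
def badness(usage_mib, allowed_mib, oom_score_adj=0):
    """Simplified model (an assumption, not the kernel implementation):
    usage as a per-mille share of the allowed memory, shifted by
    oom_score_adj and clamped to the range 0..1000."""
    if allowed_mib <= 0:
        raise ValueError("allowed memory must be positive")
    points = usage_mib * 1000 // allowed_mib + oom_score_adj
    return max(0, min(1000, points))


# Hypothetical sizes: a 16 GiB host and a memcg limited to 4 GiB.
SYSTEM_MIB = 16384
MEMCG_MIB = 4096
TASK_MIB = 2048

# The same task, scored in two different OOM contexts.
system_wide = badness(TASK_MIB, SYSTEM_MIB)
in_memcg = badness(TASK_MIB, MEMCG_MIB)

# The same task with oom_score_adj = -100 in both contexts.
adjusted_system = badness(TASK_MIB, SYSTEM_MIB, -100)
adjusted_memcg = badness(TASK_MIB, MEMCG_MIB, -100)

print(system_wide, in_memcg, adjusted_system, adjusted_memcg)
```

With these made-up numbers the same 2048 MiB task scores 125 when the OOM is
system-wide but 500 when its 4096 MiB memcg hits its limit, while an
oom_score_adj of -100 lowers the score by the same 100 points in both cases.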

Please take additional care of users if we go this way, or remove the
user-visible oom_score file from /proc.

Thanks,
-kame


