On Tue, 3 Aug 2010, KAMEZAWA Hiroyuki wrote:

> Hmm, then, oom_score shows the values for all limitations in array ?
>

/proc/pid/oom_score will change if a task's cpuset, memcg, or mempolicy
attachment changes, or if its mems, nodes, or limit changes, because it's
a proportion of available memory.  /proc/pid/oom_score_adj stays constant,
so the oom killing priority of that task, relative to other tasks sharing
the same constraints and competing for the same memory, stays the same.

The point is that it doesn't matter how much memory a task has available,
but rather what its priority is among the tasks that compete with it for
memory.  You could, of course, do some simple arithmetic to write a memory
quantity to oom_score_adj if you really wanted to, but that would force
the user to recalculate the value anytime the task's cpuset, mempolicy, or
memcg changes.

> > > Usual disto alreay enables it.
> > >
> >
> > Yes, I'm well aware of my 40MB of lost memory on my laptop :)
> >
> Very sorry ;)
> But it's required to track memory usage from init...
>

Memcg comes with a cost of ~1% of system memory on x86, since struct
page_cgroup, allocated for every page, is ~1% of the size of a 4K page.
That means if we were to deploy memcg on all of our servers, and the
number of jobs we can run is constrained only by memory, it's equivalent
to losing ~1% of our servers.  That, for us, is very large.

This is a different topic entirely, but it's a very significant
disadvantage, and enough that most people who care about oom killing
prioritization aren't going to want to incur such an overhead by enabling
memcg or setting up an individual memcg for each and every job, because
that requires specific knowledge of all those jobs.

I'm not by any means proposing oom_score_adj as being very popular for the
usual desktop environments :)
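To make the "simple arithmetic" mentioned above concrete, here is a rough
sketch in Python.  It assumes oom_score_adj is expressed in units of
1/1000th of the memory the task is allowed to use and is limited to the
range -1000 to +1000; the function name and the example sizes are
hypothetical, and nothing here writes to /proc.

```python
# Sketch only: convert a signed memory quantity into oom_score_adj units.
# Assumption: one unit of oom_score_adj is 1/1000th of the task's
# allowable memory, and valid values lie in [-1000, 1000].

OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000


def bytes_to_oom_score_adj(quantity_bytes, allowed_bytes):
    """Return the oom_score_adj value equivalent to quantity_bytes.

    quantity_bytes: signed amount of memory to bias the task by
                    (negative means "discount this much usage").
    allowed_bytes:  memory available under the task's current
                    cpuset, mempolicy, or memcg constraint.
    """
    if allowed_bytes <= 0:
        raise ValueError("allowed_bytes must be positive")
    # Scale to thousandths of allowable memory, truncating toward zero.
    adj = int(quantity_bytes * 1000 / allowed_bytes)
    # Clamp to the valid range.
    return max(OOM_SCORE_ADJ_MIN, min(OOM_SCORE_ADJ_MAX, adj))


MiB = 1024 * 1024
GiB = 1024 * MiB

# Discount 512 MiB for a task confined to 4 GiB of memory.
adj_small = bytes_to_oom_score_adj(-512 * MiB, 4 * GiB)

# The same 512 MiB discount after the constraint grows to 8 GiB:
# the value has to be recomputed, which is the burden described above.
adj_large = bytes_to_oom_score_adj(-512 * MiB, 8 * GiB)

print(adj_small, adj_large)
```

The two results differ even though the memory quantity is the same, which
is why a value written as a fixed quantity would need recalculating every
time the task's constraints change, while a plain proportion would not.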