On Tue, 1 Jun 2010, Nick Piggin wrote: > > This a complete rewrite of the oom killer's badness() heuristic which is > > used to determine which task to kill in oom conditions. The goal is to > > make it as simple and predictable as possible so the results are better > > understood and we end up killing the task which will lead to the most > > memory freeing while still respecting the fine-tuning from userspace. > > Do you have particular ways of testing this (and other heuristics > changes such as the forkbomb detector)? > Yes, the patch prior to this one in the series, "oom: enable oom tasklist dump by default", allows you to examine the oom_score_adj of all eligible tasks. Used in combination with /proc/pid/oom_score, which reports the result of the badness heuristic to userspace, I tested the result of the change by ensuring that it worked as intended. Since we'll now see a tasklist dump of all eligible tasks whenever someone reports an oom problem (hopefully fewer reports as a result of this rewrite than currently!), it's much easier to determine (i) why the oom killer was called, and (ii) why a particular task was chosen for kill. That's been my testing philosophy. The forkbomb detector does add a minimal bias to tasks that have a large number of execve children, just as the current oom killer does (although the bias is much smaller with my heursitic). Rik and I had a lengthy conversation on linux-mm about that when it was first proposed. The key to that particular bias is that you must remember that even though a task is selected for oom kill that the oom killer still attempts to kill an execve child first. So the end result is that an important system daemon, such as a webserver, doesn't actually get oom killed when it's selected as a result of this, but it's more of a bias toward the children to be killed (a client) instead. We're guaranteed that a child will be killed if a task is chosen as the result of a tiebreaker because of the forkbomb detector because it surely has a child with a different mm that is eligible. This isn't meant to be enforce a kernel-wide forkbomb policy, which would obviously be better implemented elsewhere, but rather bias the children when a parent is forking an egregiously large number of tasks. "Egregious" in this case is defined as whatever the user uses for oom_forkbomb_thres, which I believe defaults to a sane value of 1000. > Such that you can look at your test case or workload and see that > it is really improved? > I'm glad you asked that because some recent conversation has been slightly confusing to me about how this affects the desktop; this rewrite significantly improves the oom killer's response for desktop users. The core ideas were developed in the thread from this mailing list back in February called "Improving OOM killer" at http://marc.info/?t=126506191200004&r=4&w=2 -- users constantly report that vital system tasks such as kdeinit are killed whenever a memory hogging task is forked either intentionally or unintentionally. I argued for a while that KDE should be taking proper precautions by adjusting its own oom_adj score and that of its forked children as it's an inherited value, but I was eventually convinced that an overall improvement to the heuristic must be made to kill a task that was known to free a large amount of memory that is resident in RAM and that we have a consistent way of defining oom priorities when a task is run uncontained and when it is a member of a memcg or cpuset (or even mempolicy now), even in the case when it's contained out from under the task's knowledge. When faced with memory pressure from an out of control or memory hogging task on the desktop, the oom killer now kills it instead of a vital task such as an X server (and oracle, webserver, etc on server platforms) because of the use of the task's rss instead of total_vm statistic. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxxx For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>