On Sat, Dec 07, 2013 at 10:12:19AM -0800, Tim Hockin wrote:
> You more or less described the fundamental change - a score per memcg, with
> a recursive OOM killer which evaluates scores between siblings at the same
> level.
>
> It gets a bit complicated because we have need of wider scoring ranges than
> are provided by default

If so, I'm sure you can make a convincing case to widen the internal
per-task score ranges.  The per-memcg score ranges have not even been
defined, so this is even easier.

> and because we score PIDs against mcgs at a given scope.

You are describing bits of a solution, not a problem.  And I can't
possibly infer a problem from this.

> We also have some tiebreaker heuristic (age).

Either periodically update the per-memcg score from userspace or
implement this in the kernel.  We have considered CPU usage
history/runtime etc. in the past when picking an OOM victim task.

But I'm again just speculating what your problem is, so this may or
may not be a feasible solution.

> We also have a handful of features that depend on OOM handling like the
> aforementioned automatically growing and changing the actual OOM score
> depending on usage in relation to various thresholds (e.g. we sold you X,
> and we allow you to go over X but if you do, your likelihood of death in
> case of system OOM goes up).

You can trivially monitor threshold events from userspace with the
existing infrastructure and accordingly update the per-memcg score.

> Do you really want us to teach the kernel policies like this?  It would be
> way easier to do and test in userspace.

Maybe.  Providing fragments of your solution is not an efficient way
to communicate the problem.  And you have to sell the problem before
anybody can be expected to even consider your proposal as one of the
possible solutions.
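
For illustration only, here is a minimal sketch of the userspace side of the
threshold approach mentioned above. It assumes the cgroup v1 memory
controller, where a usage threshold is registered by writing
"<event_fd> <fd of memory.usage_in_bytes> <threshold>" to cgroup.event_control.
The per-memcg score is only a proposal in this thread, so the score range
(0..1000), the linear policy, and the names score_for_usage and
threshold_registration are hypothetical, not an existing kernel interface.

```python
# Sketch only: the per-memcg OOM score discussed in this thread is a
# proposal, so the range and the policy below are hypothetical.

SCORE_MIN = 0      # assumed lower bound of the proposed per-memcg score
SCORE_MAX = 1000   # assumed upper bound of the proposed per-memcg score


def threshold_registration(event_fd: int, usage_fd: int, threshold_bytes: int) -> str:
    """Build the line written to cgroup.event_control (cgroup v1 memory
    controller) to be notified on event_fd when usage crosses the threshold."""
    return f"{event_fd} {usage_fd} {threshold_bytes}"


def score_for_usage(usage_bytes: int, sold_bytes: int, max_bytes: int) -> int:
    """Hypothetical policy: lowest score at or below the sold amount X,
    rising linearly to the highest score as usage approaches max_bytes."""
    if max_bytes <= sold_bytes:
        raise ValueError("max_bytes must exceed sold_bytes")
    if usage_bytes <= sold_bytes:
        return SCORE_MIN
    # Clamp usage so that anything beyond max_bytes maps to SCORE_MAX.
    overage = min(usage_bytes, max_bytes) - sold_bytes
    span = max_bytes - sold_bytes
    return SCORE_MIN + (SCORE_MAX - SCORE_MIN) * overage // span
```

A monitoring daemon would register one threshold per step of interest, block
on the eventfd, and on each wakeup recompute the score from the current usage
and write it wherever the per-memcg score ends up being exposed.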