You more or less described the fundamental change - a score per memcg, with a recursive OOM killer which evaluates scores between siblings at the same level.
It gets a bit complicated because we need wider scoring ranges than are provided by default and because we score PIDs against memcgs at a given scope. We also have a tiebreaker heuristic (age).
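
A rough illustration of the recursive selection described above, in Python for brevity. This is only a sketch under stated assumptions, not the actual implementation: the names Task, Memcg, has_tasks and select_victim are hypothetical, scores are plain integers where higher means "kill first", and the age tiebreaker is assumed to prefer the youngest entity (the direction of the tiebreak is a guess).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Task:
    pid: int
    score: int          # higher score = more likely to be killed (assumption)
    start_time: float   # larger value = younger


@dataclass
class Memcg:
    name: str
    score: int
    start_time: float
    # Tasks and child memcgs live side by side at the same scope.
    children: List[Union["Memcg", Task]] = field(default_factory=list)


def has_tasks(node: Union[Memcg, Task]) -> bool:
    """True if the node is a task or contains at least one task."""
    if isinstance(node, Task):
        return True
    return any(has_tasks(child) for child in node.children)


def select_victim(cg: Memcg) -> Optional[Task]:
    """Compare siblings at this level only, then recurse into the loser."""
    candidates = [child for child in cg.children if has_tasks(child)]
    if not candidates:
        return None
    # Highest score loses; on a tie the youngest loses (assumed direction).
    worst = max(candidates, key=lambda c: (c.score, c.start_time))
    if isinstance(worst, Task):
        return worst
    return select_victim(worst)
```

Note that a task's score only matters against its siblings: a task with a very high score inside a low-scoring memcg is not picked while a higher-scoring sibling memcg still has tasks.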
We also have a handful of features that depend on OOM handling, like the aforementioned automatic growing, and changing the actual OOM score depending on usage in relation to various thresholds (e.g. we sold you X, and we allow you to go over X, but if you do, your likelihood of death in case of system OOM goes up).
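
The threshold idea can be sketched the same way. Everything here is hypothetical: the function name adjusted_score, the linear penalty, and the default values for max_score and penalty_per_percent are made up for illustration. The only part taken from the description above is that the score stays put while usage is within what was sold and rises once usage goes over.

```python
def adjusted_score(base_score: int, usage: int, sold: int,
                   max_score: int = 10000,
                   penalty_per_percent: int = 10) -> int:
    """Raise the OOM score as usage exceeds the amount sold (illustrative)."""
    if sold <= 0:
        raise ValueError("sold must be positive")
    if usage <= sold:
        # Within the purchased amount: no extra risk.
        return base_score
    # Integer percentage by which usage exceeds the sold amount.
    over_percent = (usage - sold) * 100 // sold
    # Linear penalty, clamped to the top of the scoring range.
    return min(max_score, base_score + penalty_per_percent * over_percent)
```

With these made-up defaults, a job sold 100 units and using 150 goes from a base score of 100 to 600, so it becomes a more likely victim among its siblings on system OOM.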
Do you really want us to teach the kernel policies like this? It would be way easier to do and test in userspace.
Tim