> Yes and nobody is disputing that, really. I guess the main disconnect
> here is that different people want to have more detailed control over
> the victim selection while the patchset tries to handle the most
> simplistic scenario when no userspace control over the selection is
> required. And I would claim that this will be a vast majority of setups
> and we should address it first.

IMHO the disconnect/disagreement is about which memcgs should be
compared with each other for oom victim selection. Let's forget about
oom priority and just take size into account. Should the oom selection
algorithm compare the leaves of the hierarchy, or should it compare
siblings? For a single user system, comparing leaves makes sense, while
in a multi user system, siblings should be compared for victim
selection.

Coming back to the same example:

       root
       /  \
      A    D
     / \
    B   C

Let's view it as a multi user system where some central job scheduler
has asked a node controller on this system to start two jobs, 'A' and
'D'. 'A' then went on to create sub-containers. Now, on system oom, IMO
the simplest sensible thing to do from the semantic point of view is to
compare 'A' and 'D', and if 'A''s usage is higher, then either kill all
of 'A' if oom_group is set, or recursively find a victim memcg taking
'A' as root.

I have noted before that for single user systems, comparing 'B', 'C'
and 'D' is the most sensible thing to do. Now, in the multi user
system, I can kind of force the comparison of 'A' and 'D' by setting
oom_group on 'A'. IMO that is an abuse of 'oom_group', as it would then
carry two meanings: comparison leader and killall. I would humbly
suggest having two separate notions instead. Let's say oom_gang (if you
prefer, just 'oom_group' is fine too) and killall.

For the single user system example, 'B', 'C' and 'D' will have
'oom_gang' set, and if the user wants killall semantics too, they can
set it separately. For the multi user system, 'A' and 'D' will have
'oom_gang' set.
Now, let's say 'A' was selected on system oom. If 'killall' was set on
'A', then 'A' will be selected as the victim; otherwise the oom
selection algorithm will recursively take 'A' as root and try to find a
victim memcg.

Another major semantic of 'oom_gang' is that the leaves will always be
treated as 'oom_gang'.
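
To make the proposed semantics concrete, here is a rough userspace
sketch in Python. It is only a toy model under my own assumptions: the
Memcg class, its fields, the usage numbers and the helper names are all
made up for illustration, and none of this is kernel code or an existing
interface. It only takes size into account, as in the discussion above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Memcg:
    """Toy model of a memory cgroup; every field here is illustrative."""
    name: str
    own_usage: int = 0        # memory charged directly to this memcg
    oom_gang: bool = False    # hypothetical knob: compare this memcg as one unit
    killall: bool = False     # hypothetical knob: kill everything inside it
    children: List["Memcg"] = field(default_factory=list)

    def usage(self) -> int:
        # Hierarchical usage: own charge plus the usage of all descendants.
        return self.own_usage + sum(c.usage() for c in self.children)

    def is_gang(self) -> bool:
        # Leaves are always treated as 'oom_gang'.
        return self.oom_gang or not self.children


def candidates(root: Memcg) -> List[Memcg]:
    """Collect the comparison units below root, stopping at each gang."""
    found: List[Memcg] = []
    for child in root.children:
        if child.is_gang():
            found.append(child)
        else:
            found.extend(candidates(child))
    return found


def select_victim(root: Memcg) -> Optional[Memcg]:
    """Pick the largest gang; recurse into it unless it is killall or a leaf."""
    units = candidates(root)
    if not units:
        return None
    biggest = max(units, key=Memcg.usage)
    if biggest.killall or not biggest.children:
        return biggest
    return select_victim(biggest)


def build_example(multi_user: bool, killall_a: bool = False) -> Memcg:
    """Build the root/A/D/B/C tree from the example with made-up sizes."""
    b = Memcg("B", own_usage=35)
    c = Memcg("C", own_usage=25)
    a = Memcg("A", oom_gang=multi_user, killall=killall_a, children=[b, c])
    d = Memcg("D", own_usage=40)
    return Memcg("root", children=[a, d])
```

With these made-up sizes, the single user setup compares 'B', 'C' and
'D' and picks 'D'. The multi user setup compares 'A' (60) with 'D' (40),
picks 'A', and then either returns 'A' itself if killall is set or
recurses into 'A' and picks 'B'.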