On Fri, Sep 15, 2017 at 12:58:26PM +0200, Michal Hocko wrote:
> On Thu 14-09-17 09:05:48, Roman Gushchin wrote:
> > On Thu, Sep 14, 2017 at 03:40:14PM +0200, Michal Hocko wrote:
> > > On Wed 13-09-17 14:56:07, Roman Gushchin wrote:
> > > > On Wed, Sep 13, 2017 at 02:29:14PM +0200, Michal Hocko wrote:
> > > [...]
> > > > > I strongly believe that comparing only leaf memcgs
> > > > > is more straightforward and it doesn't lead to unexpected results as
> > > > > mentioned before (kill a small memcg which is a part of the larger
> > > > > sub-hierarchy).
> > > >
> > > > One of two main goals of this patchset is to introduce cgroup-level
> > > > fairness: bigger cgroups should be affected more than smaller,
> > > > despite the size of tasks inside. I believe the same principle
> > > > should be used for cgroups.
> > >
> > > Yes bigger cgroups should be preferred but I fail to see why bigger
> > > hierarchies should be considered as well if they are not kill-all. And
> > > whether non-leaf memcgs should allow kill-all is not entirely clear to
> > > me. What would be the usecase?
> >
> > We definitely want to support kill-all for non-leaf cgroups.
> > A workload can consist of several cgroups and we want to clean up
> > the whole thing on OOM.
>
> Could you be more specific about such a workload? E.g. how can be such a
> hierarchy handled consistently when its sub-tree gets killed due to
> internal memory pressure?

Or just system-wide OOM.

> Or do you expect that none of the subtree will
> have hard limit configured?

And this can also be a case: the whole workload may have hard limit
configured, while internal memcgs have only memory.low set for "soft"
prioritization.

>
> But then you just enforce a structural restriction on your configuration
> because
>        root
>        /  \
>       A    D
>      /\
>     B  C
>
> is a different thing than
>        root
>       / | \
>      B  C  D
>

I actually don't have a strong argument against an approach to select
largest leaf or kill-all-set memcg.
I think in practice there will not be much difference.

The only real concern I have is that we would then have to do the same
with oom_priorities (select the largest priority tree-wide), and this
would limit the ability of a parent cgroup to enforce the priority.

Thanks!
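
For illustration only, the structural difference between the two trees
quoted above can be sketched in a few lines of Python. This is not code
from the patchset: the Memcg class, the helper names and the usage
numbers (B=3, C=2, D=4) are all made up, kill-all and oom_priority are
ignored, and "descent" here is assumed to mean picking the largest
child at each level and recursing into it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Memcg:
    """Hypothetical stand-in for a memory cgroup (not a kernel structure)."""
    name: str
    usage: int = 0  # own usage in arbitrary units; only leaves carry usage here
    children: List["Memcg"] = field(default_factory=list)

    def total(self) -> int:
        # Usage of this cgroup plus everything below it.
        return self.usage + sum(c.total() for c in self.children)


def leaves(cg: Memcg) -> List[Memcg]:
    # Collect all leaf cgroups under cg (cg itself if it has no children).
    if not cg.children:
        return [cg]
    out: List[Memcg] = []
    for child in cg.children:
        out.extend(leaves(child))
    return out


def pick_largest_leaf(root: Memcg) -> Memcg:
    # Leaf-only comparison: the biggest leaf wins, wherever it sits in the tree.
    return max(leaves(root), key=lambda cg: cg.usage)


def pick_by_descent(root: Memcg) -> Memcg:
    # Level-by-level comparison: follow the largest subtree down to a leaf.
    cg = root
    while cg.children:
        cg = max(cg.children, key=lambda c: c.total())
    return cg


def nested() -> Memcg:
    # root -> {A -> {B, C}, D}
    a = Memcg("A", children=[Memcg("B", 3), Memcg("C", 2)])
    return Memcg("root", children=[a, Memcg("D", 4)])


def flat() -> Memcg:
    # root -> {B, C, D}
    return Memcg("root", children=[Memcg("B", 3), Memcg("C", 2), Memcg("D", 4)])


if __name__ == "__main__":
    for label, build in (("nested", nested), ("flat", flat)):
        tree = build()
        print(label,
              "largest leaf:", pick_largest_leaf(tree).name,
              "descent:", pick_by_descent(tree).name)
```

With these made-up numbers the leaf-only comparison selects D in both
layouts, while the descent selects B in the nested layout (A's subtree
totals 5, which beats D's 4) and D in the flat one. Which behaviour is
preferable is exactly the open question in this thread.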