On Thu 07-09-17 12:14:57, Johannes Weiner wrote:
> On Wed, Sep 06, 2017 at 10:28:59AM +0200, Michal Hocko wrote:
> > On Tue 05-09-17 17:53:44, Johannes Weiner wrote:
> > > The cgroup-awareness in the OOM killer is exactly the same thing. It
> > > should have been the default from the beginning, because the user
> > > configures a group of tasks to be an interdependent, terminal unit of
> > > memory consumption, and it's undesirable for the OOM killer to ignore
> > > this intention and compare members across these boundaries.
> >
> > I would agree if that was true in general. I can completely see how the
> > cgroup awareness is useful in e.g. containerized environments (especially
> > with kill-all enabled) but memcgs are used in a large variety of
> > usecases and I cannot really say all of them really demand the new
> > semantic. Say I have a workload which doesn't want to see reclaim
> > interference from others on the same machine. Why should I kill a
> > process from that particular memcg just because it is the largest one
> > when there is a memory hog/leak outside of this memcg?
>
> Sure, it's always possible to come up with a config for which this
> isn't the optimal behavior. But this is about picking a default that
> makes sense to most users, and that type of cgroup usage just isn't
> the common case.

How can you tell, really? Even if cgroup2 is a new interface, we still
want as many legacy (v1) users as possible to migrate to the new
hierarchy. I have seen quite different usecases over time and I have a
hard time telling which of them to call common enough.

> > From my point of view the safest (in a sense of the least surprise)
> > way to go with opt-in for the new heuristic. I am pretty sure all who
> > would benefit from the new behavior will enable it while others will not
> > regress in unexpected way.
>
> This thinking simply needs to be balanced against the need to make an
> unsurprising and consistent final interface.

Sure.
And I _think_ we can come up with a clear interface to configure
the oom behavior - e.g. a kernel command line parameter with a default
based on a config option.

> The current behavior breaks isolation by letting tasks in different
> cgroups compete with each other during an OOM kill. While you can
> rightfully argue that it's possible for usecases to rely on this, you
> cannot tell me that this is the least-surprising thing we can offer
> users; certainly not new users, but also not many/most existing ones.

I would argue that a global OOM has always been a special case and
people have gotten used to the "kill the largest task" strategy. I have
seen multiple reports where people complained when this wasn't the case
(e.g. when NUMA policies were involved).

> > We can talk about the way _how_ to control these oom strategies, of
> > course. But I would be really reluctant to change the default which is
> > used for years and people got used to it.
>
> I really doubt there are many cgroup users that rely on that
> particular global OOM behavior.
>
> We have to agree to disagree, I guess.

Yes, I am afraid so. And I have not heard that this is a feature so many
users have been asking for, and for so long, that we can simply say "yeah
everybody wants that, make it a default". As such, I do not see a reason
why we should enforce it on all users. It is really trivial to enable it
when it is considered useful.
-- 
Michal Hocko
SUSE Labs