On 10/27/2017 10:05 PM, Johannes Weiner wrote:
> On Thu, Oct 26, 2017 at 02:03:41PM -0700, David Rientjes wrote:
>> On Thu, 26 Oct 2017, Johannes Weiner wrote:
>>
>>>> The nack is for three reasons:
>>>>
>>>> (1) unfair comparison of root mem cgroup usage to bias against that mem
>>>> cgroup from oom kill in system oom conditions,
>>>>
>>>> (2) the ability of users to completely evade the oom killer by attaching
>>>> all processes to child cgroups either purposefully or unpurposefully,
>>>> and
>>>>
>>>> (3) the inability of userspace to effectively control oom victim
>>>> selection.
>>> My apologies if my summary was too reductionist.
>>>
>>> That being said, the arguments you repeat here have come up in
>>> previous threads and been responded to. This doesn't change my
>>> conclusion that your NAK is bogus.
>> They actually haven't been responded to, Roman was working through v11 and
>> made a change on how the root mem cgroup usage was calculated that was
>> better than previous iterations but still not an apples to apples
>> comparison with other cgroups. The problem is that the calculation for
>> leaf cgroups includes additional memory classes, so it biases against
>> processes that are moved to non-root mem cgroups. Simply creating mem
>> cgroups and attaching processes should not independently cause them to
>> become more preferred: it should be a fair comparison between the root mem
>> cgroup and the set of leaf mem cgroups as implemented. That is very
>> trivial to do with hierarchical oom cgroup scoring.
> There is absolutely no value in your repeating the same stuff over and
> over again without considering what other people are telling you.
>
> Hierarchical oom scoring has other downsides, and most of us agree
> that they aren't preferable over the differences in scoring the root
> vs scoring other cgroups - in particular because the root cannot be
> controlled, doesn't even have local statistics, and so is unlikely to
> contain important work on a containerized system. Getting the ballpark
> right for the vast majority of usecases is more than good enough here.
>
>> Since the ability of userspace to control oom victim selection is not
>> addressed whatsoever by this patchset, and the suggested method cannot be
>> implemented on top of this patchset as you have argued because it requires
>> a change to the heuristic itself, the patchset needs to become complete
>> before being mergeable.
> It is complete. It just isn't a drop-in replacement for what you've
> been doing out-of-tree for years. Stop making your problem everybody
> else's problem.
>
> You can change the heuristics later, as you have done before. Or
> you can add another configuration flag and we can phase out the old
> mode, like we do all the time.
>
I think this problem is related to the removal of the lowmemorykiller,
where this is the lifeline when user space fails for some reason.
So I guess quite a few users will have this problem.