Re: [RESEND v12 0/6] cgroup-aware OOM killer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 19 Oct 2017, Johannes Weiner wrote:

> David would have really liked for this patchset to include knobs to
> influence how the algorithm picks cgroup victims. The rest of us
> agreed that this is beyond the scope of these patches, that the
> patches don't need it to be useful, and that there is nothing
> preventing anyone from adding configurability later on. David
> subsequently nacked the series as he considers it incomplete. Neither
> Michal nor I see technical merit in David's nack.
> 

The nack is for three reasons:

 (1) unfair comparison of root mem cgroup usage to bias against that mem 
     cgroup from oom kill in system oom conditions,

 (2) the ability of users to completely evade the oom killer by attaching
     all processes to child cgroups either purposefully or unpurposefully,
     and

 (3) the inability of userspace to effectively control oom victim  
     selection.

For (1), the difference in v12 is adding the rss of all processes attached 
to the root mem cgroup as the evaluation.  This is not the same criteria 
that child cgroups are evaluated on, and they are compared using that 
bias.  It is very trivial to provide a fair comparison as was suggested in 
v11.

For (2), users who do

	for i in $(cat cgroup.procs); do mkdir $i; echo $i > $i/cgroup.procs; done

can completely evade the oom killer and this may be part of their standard 
operating procedure for restricting resources with other cgroups other 
than the mem cgroup.  Again, it's an unfair comparison to all other 
cgroups on the system.

For (3), users need the ability to protect important cgroups, such as 
protecting a cgroup that is limited to 50% of system memory.  They need 
the ability to kill processes from other cgroups to protect these 
important processes.  This is nothing new: the oom killer has always 
provided the ability to bias against important processes.

There was follow-up email on all of these points where very trivial 
changes were suggested to address all three of these issues, and which 
Roman has implemented in one form or another in previous iterations with 
the bonus that no accounting to the root mem cgroup needs to be done.

> Michal acked the implementation, but on the condition that the new
> behavior be opt-in, to not surprise existing users. I *think* we agree
> that respecting the cgroup topography during global OOM is what we
> should have been doing when cgroups were initially introduced; where
> we disagree is that I think users shouldn't have to opt in to
> improvements. We have done much more invasive changes to the victim
> selection without actual regressions in the past. Further, this change
> only applies to mounts of the new cgroup2. Tejun also wasn't convinced
> of the risk for regression, and too would prefer cgroup-awareness to
> be the default in cgroup2. I would ask for patch 5/6 to be dropped.
> 

I agree with Michal that the new victim selection should be opt-in with 
CGRP_GROUP_OOM.
--
To unsubscribe from this list: send the line "unsubscribe cgroups" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux