On Fri, Mar 11, 2016 at 09:42:39AM +0100, Michal Hocko wrote:
> On Fri 11-03-16 11:34:40, Vladimir Davydov wrote:
> > On Thu, Mar 10, 2016 at 03:50:13PM -0500, Johannes Weiner wrote:
> > > When setting memory.high below usage, nothing happens until the next
> > > charge comes along, and then it will only reclaim its own charge and
> > > not the now potentially huge excess of the new memory.high. This can
> > > cause groups to stay in excess of their memory.high indefinitely.
> > >
> > > To fix that, when shrinking memory.high, kick off a reclaim cycle that
> > > goes after the delta.
> >
> > I agree that we should reclaim the high excess, but I don't think it's a
> > good idea to do it synchronously. Currently, memory.low and memory.high
> > knobs can be easily used by a single-threaded load manager implemented
> > in userspace, because it doesn't need to care about potential stalls
> > caused by writes to these files. After this change it might happen that
> > a write to memory.high would take long, seconds perhaps, so in order to
> > react quickly to changes in other cgroups, a load manager would have to
> > spawn a thread per each write to memory.high, which would complicate its
> > implementation significantly.
>
> Is the complication on the managing part really an issue though. Such a
> manager would have to spawn a process/thread to change the .max already.

IMO memory.max is not something that has to be changed often. In most
cases it will be set on container start and stay put throughout
container lifetime. I can also imagine a case when memory.max will be
changed for all containers when a container starts or stops, so as to
guarantee that if <= N containers of M go mad, the system will survive.
In any case, memory.max is reconfigured rarely, it rather belongs to the
static configuration.

OTOH memory.low and memory.high are perfect to be changed dynamically,
basing on containers' memory demand/pressure.
A load manager might want to reconfigure these knobs say every 5
seconds. Spawning a thread per each container that often would look
unnecessarily overcomplicated IMO.

Thanks,
Vladimir
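
A minimal sketch of the single-threaded load manager pattern discussed
above, to make the stall concern concrete. The helper names
(set_memory_high, rebalance) and the example paths are hypothetical and
not taken from any existing tool; the only assumption about the kernel
interface is that each cgroup directory contains a writable memory.high
file that accepts a byte count. Because the writes happen one after
another in a single thread, any write that blocks (for example on
synchronous reclaim) delays every cgroup that comes after it, which is
why the sketch reports the latency of each write.

```python
import os
import time


def set_memory_high(cgroup_dir, limit_bytes):
    """Write limit_bytes to <cgroup_dir>/memory.high.

    Returns the number of seconds the write took, so the caller can
    see how long it was blocked.
    """
    path = os.path.join(cgroup_dir, "memory.high")
    start = time.monotonic()
    with open(path, "w") as f:
        f.write(str(limit_bytes))
    return time.monotonic() - start


def rebalance(limits):
    """Apply {cgroup_dir: limit_bytes} sequentially in the calling thread.

    Returns {cgroup_dir: seconds_spent_in_write}. A slow write to one
    cgroup postpones the writes to all cgroups that follow it.
    """
    return {cg: set_memory_high(cg, limit) for cg, limit in limits.items()}
```

A manager of this kind would call rebalance() with freshly computed
limits on a timer (say every 5 seconds), e.g.
rebalance({"/sys/fs/cgroup/ct1": 512 << 20}), where the path is only an
example of a cgroup v2 directory. If a write can take seconds, the
alternative is to issue each write from its own thread, which is the
extra complexity being objected to.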