Hello Michal, On Fri, Feb 21, 2020 at 06:10:24PM +0100, Michal Koutný wrote: > On Thu, Dec 19, 2019 at 03:07:17PM -0500, Johannes Weiner <hannes@xxxxxxxxxxx> wrote: > > + * Consider the following example tree: > > * > > + * A A/memory.low = 2G, A/memory.current = 6G > > + * //\\ > > + * BC DE B/memory.low = 3G B/memory.current = 2G > > + * C/memory.low = 1G C/memory.current = 2G > > + * D/memory.low = 0 D/memory.current = 2G > > + * E/memory.low = 10G E/memory.current = 0 > > * > > + * and memory pressure is applied, the following memory > > + * distribution is expected (approximately*): > > * > > + * A/memory.current = 2G > > + * B/memory.current = 1.3G > > + * C/memory.current = 0.6G > > + * D/memory.current = 0 > > + * E/memory.current = 0 > > * > > + * *assuming equal allocation rate and reclaimability > I think the assumptions for this example don't hold (anymore). > Because reclaim rate depends on the usage above protection, the siblings > won't be reclaimed equally and so the low_usage proportionality will > change over time and the equilibrium distribution is IMO different (I'm > attaching an Octave script to calculate it). Hm, this example doesn't change with my patch because there is no "floating" protection that gets distributed among the siblings. In my testing with the above parameters, the equilibrium still comes out to roughly this distribution. > As it depends on the initial usage, I don't think there can be given > such a general example (for overcommit). It's just to illustrate the pressure weight, not to reflect each factor that can influence the equilibrium. I think it still has value to gain understanding of how it works, no? > > @@ -6272,12 +6262,63 @@ struct cgroup_subsys memory_cgrp_subsys = { > > * for next usage. This part is intentionally racy, but it's ok, > > * as memory.low is a best-effort mechanism. > Although it's a different issue but since this updates the docs I'm > mentioning it -- we treat memory.min the same, i.e. it's subject to the > same race, however, it's not meant to be best effort. I didn't look into > outcomes of potential misaccounting but the comment seems to miss impact > on memory.min protection. Yeah I think we can delete that bit. > > @@ -6292,52 +6333,29 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root, > > [...] > > + if (parent == root) { > > + memcg->memory.emin = memcg->memory.min; > > + memcg->memory.elow = memcg->memory.low; > > + goto out; > > } > Shouldn't this condition be 'if (parent == root_mem_cgroup)'? (I.e. 1st > level takes direct input, but 2nd and further levels redistribute only > what they really got from parent.) I believe we cleared this up in the parallel thread, but just in case: reclaim can happen due to a memory.max set lower in the tree. memory.low propagation is always relative from the reclaim scope, not the system-wide root cgroup.