On Thu, Feb 13, 2020 at 12:41:36PM -0500, Johannes Weiner wrote:
> On Thu, Feb 13, 2020 at 04:46:27PM +0100, Michal Hocko wrote:
> > On Thu 13-02-20 08:23:17, Johannes Weiner wrote:
> > > On Thu, Feb 13, 2020 at 08:40:49AM +0100, Michal Hocko wrote:
> > > > On Wed 12-02-20 12:08:26, Johannes Weiner wrote:
> > > > > On Tue, Feb 11, 2020 at 05:47:53PM +0100, Michal Hocko wrote:
> > > > > > Unless I am missing something then I am afraid it doesn't. Say you have a
> > > > > > default systemd cgroup deployment (aka deeper cgroup hierarchy with
> > > > > > slices and scopes) and now you want to grant a reclaim protection on a
> > > > > > leaf cgroup (or even a whole slice that is not really important). All the
> > > > > > hierarchy up the tree has the protection set to 0 by default, right? You
> > > > > > simply cannot get that protection. You would need to configure the
> > > > > > protection up the hierarchy and that is really cumbersome.
> > > > >
> > > > > Okay, I think I know what you mean. Let's say you have a tree like
> > > > > this:
> > > > >
> > > > >           A
> > > > >          / \
> > > > >        B1   B2
> > > > >       / \    \
> > > > >     C1   C2   C3
> > > > So let's see how that works in practice, say a multi workload setup
> > > > with complex/deep cgroup hierarchies (e.g. your above example). No
> > > > delegation point this time.
> > > >
> > > > C1 asks for low=1G while using 500M, C3 low=100M using 80M. B1 and
> > > > B2 are completely independent workloads and the same applies to C2 which
> > > > doesn't ask for any protection at all? C2 uses 100M. Now the admin has
> > > > to propagate protection upwards so B1 low=1G, B2 low=100M and A low=1G,
> > > > right? Let's say we have a global reclaim due to external pressure that
> > > > originates from outside of A hierarchy (it is not overcommitted on the
> > > > protection).
> > > >
> > > > Unless I miss something C2 would get a protection even though nobody
> > > > asked for it.
> > > >
> > > Good observation, but I think you spotted an unintentional side effect
> > > of how I implemented the "floating protection" calculation rather than
> > > a design problem.
> > >
> > > My patch still allows explicit downward propagation. So if B1 sets up
> > > 1G, and C1 explicitly claims those 1G (low>=1G, usage>=1G), C2 does
> > > NOT get any protection. There is no "floating" protection left in B1
> > > that could get to C2.
> >
> > Yeah, the saturated protection works reasonably AFAICS.
>
> Hm, Tejun raises a good point though: even if you could direct memory
> protection down to one targeted leaf, you can't do the same with IO or
> CPU. Those follow non-conserving weight distribution, and whatever you

"work-conserving", obviously.

> allocate to a certain level is available at that level - if one child
> doesn't consume it, the other children can.
>
> And we know that controlling memory without controlling IO doesn't
> really work in practice. The sibling with less memory allowance will
> just page more.
>
> So the question becomes: is this even a legit usecase? If every other
> resource is distributed on a level-by-level method already, does it
> buy us anything to make memory work differently?
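
For anyone following along, the B1 subtree arithmetic from the two cases
quoted above can be sketched roughly as below. This is only an
illustration, not the kernel code and not the patch: the name
distribute_low, the proportional split of unclaimed protection by
unprotected usage, and the cap at each child's usage are assumptions made
for the example. It also assumes the parent is not overcommitted, as in
Michal's scenario.

```python
MB = 1 << 20
GB = 1 << 30


def distribute_low(parent_protection, children):
    """Hypothetical sketch of "floating" protection inside one cgroup level.

    parent_protection: bytes of protection available at the parent (e.g. B1).
    children: dict mapping child name -> (low, usage), both in bytes.
    Returns a dict mapping child name -> effective protection in bytes.

    Assumption: protection not explicitly claimed by any child floats to
    the children in proportion to their unprotected usage. Overcommit
    (claims exceeding the parent's protection) is not handled here.
    """
    # A child can only claim as much protection as it actually uses.
    claimed = {name: min(low, usage) for name, (low, usage) in children.items()}
    total_claimed = sum(claimed.values())
    total_usage = sum(usage for _, usage in children.values())

    # Whatever the parent has that no child explicitly claimed.
    floating = max(parent_protection - total_claimed, 0)
    unprotected_total = total_usage - total_claimed

    effective = {}
    for name, (low, usage) in children.items():
        share = 0
        if floating > 0 and unprotected_total > 0:
            share = floating * (usage - claimed[name]) // unprotected_total
        # Protection beyond current usage has no effect, so cap it.
        effective[name] = min(claimed[name] + share, usage)
    return effective


if __name__ == "__main__":
    # Michal's case: B1 low=1G, C1 low=1G using 500M, C2 low=0 using 100M.
    unsaturated = distribute_low(GB, {"C1": (GB, 500 * MB), "C2": (0, 100 * MB)})
    print({name: value // MB for name, value in unsaturated.items()})
    # prints {'C1': 500, 'C2': 100}

    # Saturated case: C1 claims the full 1G (low>=1G, usage>=1G).
    saturated = distribute_low(GB, {"C1": (GB, GB), "C2": (0, 100 * MB)})
    print({name: value // MB for name, value in saturated.items()})
    # prints {'C1': 1024, 'C2': 0}
```

Under these assumptions the first case leaves 524M unclaimed in B1, which
ends up covering all of C2's 100M even though C2 never asked for
protection. In the second case nothing is left to float, so C2 gets none.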