On Thu 21-05-20 14:41:47, Chris Down wrote:
> Michal Hocko writes:
> > On Thu 21-05-20 13:57:59, Chris Down wrote:
[...]
> > > If you're talking about reclaim, trying to reason about whether the overage
> > > is the result of some other task in this cgroup or the task that's
> > > allocating right now is something that we already know doesn't work well
> > > (eg. global OOM).
> >
> > I am not sure I follow you here.
>
> Let me rephrase: your statement is that it's not desirable "that some task
> would be throttled unexpectedly too long because of [the activity of another
> task also within that cgroup]" (let me know if that's not what you meant).
> But trying to avoid that requires knowing which activity abstractly
> instigates this entire mess in the first place, which we have nowhere near
> enough context to determine.

Yeah, if we want to be really precise then you are right, nothing like
that is really feasible for the reclaim. Reclaiming 1 page might be much
more expensive than reclaiming 100 pages because LRU order doesn't reflect
the cost of the reclaim at all. What, I believe, we want is a best effort,
really. If the reclaim target is somehow bound to the requested amount
of memory then we can at least say that more memory hungry consumers are
reclaiming more. That is something people can wrap their heads around
much more easily than a free competition on the reclaim, with some hard to
predict losers who do all the work and some lucky ones who just happen
to avoid throttling through better timing. Really, think of the direct
reclaim and how much the unfairness sucks there.
-- 
Michal Hocko
SUSE Labs