On Thu, Nov 09, 2023 at 11:02:23AM +0100, Michal Hocko wrote:
> On Wed 08-11-23 19:25:14, Gregory Price wrote:
> > This patchset implements weighted interleave and adds a new cgroup
> > sysfs entry: cgroup/memory.interleave_weights (excluded from root).
>
> Why have you chosen memory controller rather than cpuset controller?
> TBH I do not think memcg is the best fit because traditionally memcg
> accounts consumption rather than memory placement. This means that the
> memory is already allocated when it is charged for a memcg. On the other
> hand cpuset controller is the one to control the allocation placement so
> it would seem a better fit.
> --
> Michal Hocko
> SUSE Labs

I'm going to walk back my last email: memcg actually feels more
correct than cpuset, if only because of what the admin-guide says:

"""
The "memory" controller regulates distribution of memory. [... snip ...]

While not completely water-tight, all major memory usages by a given
cgroup are tracked so that the total memory consumption can be accounted
and controlled to a reasonable extent.
"""

'And controlled to a reasonable extent' seems to fit the description of
this mechanism better than the cpuset description:

"""
The "cpuset" controller provides a mechanism for constraining the CPU and
memory node placement of tasks to only the resources specified in the
cpuset interface files in a task's current cgroup.
"""

This is not a constraining interface... it's "more of a suggestion".

In particular, anything not using interleave doesn't care about these
weights at all.

The distribution is only enforced at allocation time; it does not cause
migrations... though that would be a neat idea.

This is explicitly why the interface does not allow a weight of 0
(the node should be omitted from the policy nodemask or cpuset instead).

Even if this were designed to enforce a particular distribution of
memory, I'm not certain that would belong in cpusets either - but I
suppose that is a separate discussion.
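
To make the "suggestion, not constraint" point concrete, here is a
rough userspace sketch of weighted round-robin node selection. It is
illustrative only and not the patchset's code: il_state, next_node_in(),
weighted_interleave_next(), set_weight() and MAX_NODES are all
hypothetical names, and it assumes the nodemask is a plain bitmask and
that weights are consulted only when a new allocation picks a node.

```c
#include <assert.h>

#define MAX_NODES 4	/* hypothetical small node count for the sketch */

/* Hypothetical per-task interleave state, not the kernel's structures. */
struct il_state {
	int cur_node;		/* node currently receiving allocations */
	unsigned int left;	/* allocations left on cur_node */
};

/* Reject a weight of 0: such a node belongs outside the nodemask instead. */
static int set_weight(unsigned char *weights, int node, unsigned char w)
{
	if (node < 0 || node >= MAX_NODES || w == 0)
		return -1;
	weights[node] = w;
	return 0;
}

/* Next node set in mask after 'node', wrapping around; -1 if mask is empty. */
static int next_node_in(int node, unsigned int mask)
{
	for (int i = 1; i <= MAX_NODES; i++) {
		int n = (node + i) % MAX_NODES;

		if (mask & (1u << n))
			return n;
	}
	return -1;
}

/*
 * Pick the node for the next allocation. Only new allocations consult
 * the weights; nothing already allocated is moved.
 */
static int weighted_interleave_next(struct il_state *st, unsigned int mask,
				    const unsigned char *weights)
{
	if (mask == 0)
		return -1;
	if (st->left == 0) {
		st->cur_node = next_node_in(st->cur_node, mask);
		assert(weights[st->cur_node] != 0);	/* enforced by set_weight() */
		st->left = weights[st->cur_node];
	}
	st->left--;
	return st->cur_node;
}
```

Starting from cur_node = -1 and left = 0, with nodes 0 and 1 in the mask
and weights 3 and 1, successive calls return 0, 0, 0, 1 and then repeat.
A node outside the mask is never chosen regardless of its weight, which
is why a weight of 0 would be redundant with simply omitting the node.
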
It's possible this array of weights could be used to do both, but it
seems (at least on the surface) that making this a hard control is an
excellent way to induce OOMs where you may not want them.

Anyway, summarizing: After a bit of reading, this does seem to map
better to the "accounting consumption" subsystem than the "constrain"
subsystem. However, if you think it's better suited for cpuset, I'm
happy to push in that direction.

~Gregory