Hi,

memory cgroups are fundamentally broken when it comes to partitioning the machine for many concurrent jobs.

In real life, workloads expand and contract over time, and the hard limit is too static to reflect this - it either wastes memory outside of the group, or wastes memory inside the group. As a result, the hard limit is mostly just used to catch extreme consumption peaks, while workload trimming and balancing is left to global reclaim and global OOM handling. That in turn requires more and more cgroup-awareness on the global level to make up for the lack of useful policy enforcement on the cgroup level itself.

The ongoing versioning of the cgroup user interface gives us a chance to fix such brokenness, and also clean up the interface and fix a lot of the inconsistencies and ugliness that crept in over time.

This series adds a minimal set of control files to version 2 of the memcg interface, implementing a new approach to machine partitioning.

Version 2 of this series is in response to feedback from Michal. Some of the changes are in code, but mostly it improves the documentation and changelogs to describe the fundamental problems with the original approach to machine partitioning and makes a case for the new model.

 Documentation/cgroups/unified-hierarchy.txt |  65 ++++++++
 include/linux/res_counter.h                 |  29 ++++
 include/linux/swap.h                        |   6 +-
 kernel/res_counter.c                        |   3 +
 mm/memcontrol.c                             | 250 +++++++++++++++++++---------
 mm/vmscan.c                                 |   7 +-
 6 files changed, 277 insertions(+), 83 deletions(-)