We usually deploy many containers on one host. Some of these containers
have high priority, while others have low priority.
memory.{min, low} is useful for protecting the page cache of a specified
container to gain better performance.
But currently it is only supported in cgroup v2.
To support it in cgroup v1, we only need to make small changes, as the
facility already exists.
This patch exposes two files to the user in cgroup v1: memory.min and
memory.low. These two files are set the same way as in cgroup v2.
Both hierarchical and non-hierarchical modes are supported.

Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
Cc: Shakeel Butt <shakeelb@xxxxxxxxxx>
Cc: Yafang Shao <shaoyafang@xxxxxxxxxxxxxx>
---
 Documentation/cgroup-v1/memory.txt |  4 ++++
 mm/memcontrol.c                    | 20 +++++++++++++++++++-
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/Documentation/cgroup-v1/memory.txt b/Documentation/cgroup-v1/memory.txt
index a33cedf..7178247 100644
--- a/Documentation/cgroup-v1/memory.txt
+++ b/Documentation/cgroup-v1/memory.txt
@@ -63,6 +63,10 @@ Brief summary of control files.
                                  (See 5.5 for details)
  memory.limit_in_bytes           # set/show limit of memory usage
  memory.memsw.limit_in_bytes     # set/show limit of memory+Swap usage
+ memory.min                      # set/show hard memory protection
+                                 (See ../admin-guide/cgroup-v2.rst for details)
+ memory.low                      # set/show best-effort memory protection
+                                 (See ../admin-guide/cgroup-v2.rst for details)
  memory.failcnt                  # show the number of memory usage hits limits
  memory.memsw.failcnt            # show the number of memory+Swap hits limits
  memory.max_usage_in_bytes       # show max memory usage recorded
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 3ee806b..58dce75 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -169,6 +169,12 @@ struct mem_cgroup_event {
 
 static void mem_cgroup_threshold(struct mem_cgroup *memcg);
 static void mem_cgroup_oom_notify(struct mem_cgroup *memcg);
+static int memory_min_show(struct seq_file *m, void *v);
+static ssize_t memory_min_write(struct kernfs_open_file *of,
+				char *buf, size_t nbytes, loff_t off);
+static int memory_low_show(struct seq_file *m, void *v);
+static ssize_t memory_low_write(struct kernfs_open_file *of,
+				char *buf, size_t nbytes, loff_t off);
 
 /* Stuffs for move charges at task migration. */
 /*
@@ -4288,6 +4294,18 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
 		.read_u64 = mem_cgroup_read_u64,
 	},
 	{
+		.name = "min",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = memory_min_show,
+		.write = memory_min_write,
+	},
+	{
+		.name = "low",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = memory_low_show,
+		.write = memory_low_write,
+	},
+	{
 		.name = "failcnt",
 		.private = MEMFILE_PRIVATE(_MEM, RES_FAILCNT),
 		.write = mem_cgroup_reset,
@@ -5925,7 +5943,7 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	parent = parent_mem_cgroup(memcg);
 	/* No parent means a non-hierarchical mode on v1 memcg */
 	if (!parent)
-		return MEMCG_PROT_NONE;
+		goto exit;
 
 	if (parent == root)
 		goto exit;
-- 
1.8.3.1
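
A note on the last hunk, offered as a rough userspace model rather than
kernel code. It assumes that the "exit" label in mem_cgroup_protected()
(not shown in this diff) compares the memcg's usage with its own min and
low values. All names below (model_prot, model_memcg, classify,
model_protected_before, model_protected_after) are hypothetical and exist
only for illustration; the scaling of min/low by ancestors in hierarchical
mode is left out.

```c
#include <stdbool.h>

/* Loosely mirrors enum mem_cgroup_protection; names are hypothetical. */
enum model_prot { MODEL_PROT_NONE, MODEL_PROT_LOW, MODEL_PROT_MIN };

/* Hypothetical stand-in for the few values the check reads. */
struct model_memcg {
	unsigned long usage;	/* current charge */
	unsigned long min;	/* value written to memory.min */
	unsigned long low;	/* value written to memory.low */
	bool has_parent;	/* false in non-hierarchical v1 mode */
};

/* Assumed behaviour of the "exit" label: compare usage with the limits. */
static enum model_prot classify(unsigned long usage, unsigned long emin,
				unsigned long elow)
{
	if (usage <= emin)
		return MODEL_PROT_MIN;
	if (usage <= elow)
		return MODEL_PROT_LOW;
	return MODEL_PROT_NONE;
}

/* Before the patch: a parentless memcg is never treated as protected. */
enum model_prot model_protected_before(const struct model_memcg *cg)
{
	if (!cg->has_parent)
		return MODEL_PROT_NONE;
	/* Hierarchical scaling by ancestors is omitted in this model. */
	return classify(cg->usage, cg->min, cg->low);
}

/* After the patch: a parentless memcg is judged by its own min and low. */
enum model_prot model_protected_after(const struct model_memcg *cg)
{
	return classify(cg->usage, cg->min, cg->low);
}
```

Under these assumptions, a parentless memcg with usage 100, min 200 and
low 300 (in any consistent unit) is MODEL_PROT_NONE before the change and
MODEL_PROT_MIN after it, which is what lets memory.min and memory.low take
effect in non-hierarchical mode.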