On Tue, Jul 04, 2023 at 01:52:40PM +0200, Michal Hocko wrote:
> From: Michal Hocko <mhocko@xxxxxxxx>
>
> kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
> deprecated since 58056f77502f ("memcg, kmem: further deprecate
> kmem.limit_in_bytes") merged in 5.16. We haven't heard about any
> serious users since then but it seems that the mere presence of the file
> is causing more harm than good. We (SUSE) have had several bug reports
> from customers where Docker based containers started to fail because a
> write to kmem.limit_in_bytes has failed.
>
> This was unexpected because runc code only expects ENOENT (kmem
> disabled) or EBUSY (tasks already running within cgroup). So a new error
> code was unexpected and the whole container startup failed. This has
> been later addressed by
> https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
> so current Docker runtimes do not suffer from the problem anymore. There
> are still older versions of Docker in use and likely hard to get rid of
> completely.
>
> Address this by wiping out the file completely and effectively get back
> to pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
>
> I would recommend backporting to stable trees which have picked up
> 58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").
>
> Cc: stable
> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> ---
>  Documentation/admin-guide/cgroup-v1/memory.rst |  2 --
>  mm/memcontrol.c                                | 13 -------------
>  2 files changed, 15 deletions(-)
>
> diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
> index 47d1d7d932a8..b92c71f39172 100644
> --- a/Documentation/admin-guide/cgroup-v1/memory.rst
> +++ b/Documentation/admin-guide/cgroup-v1/memory.rst
> @@ -92,8 +92,6 @@ Brief summary of control files.
>   memory.oom_control                  set/show oom controls.
>   memory.numa_stat                    show the number of memory usage per numa
>                                       node
> - memory.kmem.limit_in_bytes          This knob is deprecated and writing to
> -                                     it will return -ENOTSUPP.
>   memory.kmem.usage_in_bytes          show current kernel memory allocation
>   memory.kmem.failcnt                 show the number of kernel memory usage
>                                       hits limits
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 4b27e245a055..a0d3ed8d02e2 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3750,9 +3750,6 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
>  	case _MEMSWAP:
>  		counter = &memcg->memsw;
>  		break;
> -	case _KMEM:
> -		counter = &memcg->kmem;
> -		break;
>  	case _TCP:
>  		counter = &memcg->tcpmem;
>  		break;

This case is still needed for the remaining kmem files:

	{
		.name = "kmem.usage_in_bytes",
		.private = MEMFILE_PRIVATE(_KMEM, RES_USAGE),
		.read_u64 = mem_cgroup_read_u64,
	},
	{
		.name = "kmem.failcnt",
		.private = MEMFILE_PRIVATE(_KMEM, RES_FAILCNT),
		.write = mem_cgroup_reset,
		.read_u64 = mem_cgroup_read_u64,
	},
	{
		.name = "kmem.max_usage_in_bytes",
		.private = MEMFILE_PRIVATE(_KMEM, RES_MAX_USAGE),
		.write = mem_cgroup_reset,
		.read_u64 = mem_cgroup_read_u64,
	},

Otherwise, reading any of these files will hit BUG().

Without this hunk, the patch looks good to me.
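
To illustrate the concern, here is a minimal user-space sketch of the counter selection in mem_cgroup_read_u64(). It is a simplified model, not the kernel code: the names select_counter, model_memcg and counter_type are hypothetical, and it assumes that a type with no matching case ends up in a default branch that calls BUG(). The model returns NULL where the kernel would BUG(), so the effect of dropping the _KMEM case can be observed safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the memcg file types; not the kernel enum. */
enum counter_type { T_MEM, T_MEMSWAP, T_KMEM, T_TCP };

/* Hypothetical stand-in for struct mem_cgroup, reduced to four counters. */
struct model_memcg {
	uint64_t memory;
	uint64_t memsw;
	uint64_t kmem;
	uint64_t tcpmem;
};

/*
 * Model of the switch in mem_cgroup_read_u64().
 *
 * keep_kmem_case == true  models the code with "case _KMEM:" kept.
 * keep_kmem_case == false models the code with the quoted hunk applied.
 *
 * Returns NULL where the kernel is assumed to reach BUG().
 */
static const uint64_t *select_counter(const struct model_memcg *memcg,
				      enum counter_type type,
				      bool keep_kmem_case)
{
	assert(memcg != NULL);

	switch (type) {
	case T_MEM:
		return &memcg->memory;
	case T_MEMSWAP:
		return &memcg->memsw;
	case T_KMEM:
		if (keep_kmem_case)
			return &memcg->kmem;
		break;	/* no matching case: treated like default */
	case T_TCP:
		return &memcg->tcpmem;
	}

	return NULL;	/* the kernel would BUG() here */
}
```

In this model, a read of kmem.usage_in_bytes, kmem.failcnt or kmem.max_usage_in_bytes corresponds to calling select_counter() with T_KMEM. With the case kept it yields the kmem counter; with the hunk applied it yields NULL, which stands in for the BUG().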