The patch titled
     Subject: memcg, kmem: deprecate kmem.limit_in_bytes
has been added to the -mm tree.  Its filename is
     memcg-kmem-deprecate-kmemlimit_in_bytes.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/memcg-kmem-deprecate-kmemlimit_in_bytes.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/memcg-kmem-deprecate-kmemlimit_in_bytes.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Michal Hocko <mhocko@xxxxxxxx>
Subject: memcg, kmem: deprecate kmem.limit_in_bytes

The cgroup v1 memcg controller exposes a dedicated kmem limit to users,
which turned out to be a really bad idea because there are paths which
cannot shrink the kernel memory usage enough to get below the limit
(e.g. because the accounted memory is not reclaimable).  There are cases
where failure is not even allowed (e.g. __GFP_NOFAIL).  This means that
the kmem limit is in excess of the hard limit without any way to shrink
and thus completely useless.  The OOM killer cannot be invoked to handle
the situation because that would lead to premature OOM killing.

As a result, many places might see ENOMEM returned from kmalloc, leading
to unexpected errors, e.g. a global OOM kill when there is a lot of free
memory, because ENOMEM is translated into VM_FAULT_OOM in the #PF path
and pagefault_out_of_memory then invokes the OOM killer.

Please note that kernel memory is still accounted to the overall limit
along with user memory, so removing the kmem-specific limit should still
allow kernel memory consumption to be contained.  Unlike the kmem limit,
though, the overall limit invokes memory reclaim and targeted memcg OOM
killing if necessary.

Start the deprecation process by crying to the kernel log.  Let's see
whether there are relevant usecases and simply return EINVAL in the
second stage if nobody complains in a few releases.

Link: http://lkml.kernel.org/r/20190911151612.GI4023@xxxxxxxxxxxxxx
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx>
Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx>
Cc: Andrey Ryabinin <aryabinin@xxxxxxxxxxxxx>
Cc: Thomas Lindroth <thomas.lindroth@xxxxxxxxx>
Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 Documentation/admin-guide/cgroup-v1/memory.rst |    3 +++
 mm/memcontrol.c                                |    3 +++
 2 files changed, 6 insertions(+)

--- a/Documentation/admin-guide/cgroup-v1/memory.rst~memcg-kmem-deprecate-kmemlimit_in_bytes
+++ a/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -87,6 +87,9 @@ Brief summary of control files.
                                      node
 
  memory.kmem.limit_in_bytes          set/show hard limit for kernel memory
+                                     This knob is deprecated it shouldn't be
+                                     used. It is planned to be removed in
+                                     a foreseeable future.
  memory.kmem.usage_in_bytes          show current kernel memory allocation
  memory.kmem.failcnt                 show the number of kernel memory usage
                                      hits limits
--- a/mm/memcontrol.c~memcg-kmem-deprecate-kmemlimit_in_bytes
+++ a/mm/memcontrol.c
@@ -3647,6 +3647,9 @@ static ssize_t mem_cgroup_write(struct k
 			ret = mem_cgroup_resize_max(memcg, nr_pages, true);
 			break;
 		case _KMEM:
+			pr_warn_once("kmem.limit_in_bytes is deprecated and will be removed. "
+				     "Please report your usecase to linux-mm@xxxxxxxxx if you "
+				     "depend on this functionality.\n");
 			ret = memcg_update_kmem_max(memcg, nr_pages);
 			break;
 		case _TCP:
_

Patches currently in -mm which might be from mhocko@xxxxxxxx are

memcg-kmem-do-not-fail-__gfp_nofail-charges.patch
mm-oom-consider-present-pages-for-the-node-size.patch
memcg-kmem-deprecate-kmemlimit_in_bytes.patch