From: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> Subject: mm/memcg: disable threshold event handlers on PREEMPT_RT During the integration of PREEMPT_RT support, the code flow around memcg_check_events() resulted in `twisted code'. Moving the code around and avoiding then would then lead to an additional local-irq-save section within memcg_check_events(). While looking better, it adds a local-irq-save section to code flow which is usually within an local-irq-off block on non-PREEMPT_RT configurations. The threshold event handler is a deprecated memcg v1 feature. Instead of trying to get it to work under PREEMPT_RT just disable it. There should be no users on PREEMPT_RT. From that perspective it makes even less sense to get it to work under PREEMPT_RT while having zero users. Make memory.soft_limit_in_bytes and cgroup.event_control return -EOPNOTSUPP on PREEMPT_RT. Make an empty memcg_check_events() and memcg_write_event_control() which return only -EOPNOTSUPP on PREEMPT_RT. Document that the two knobs are disabled on PREEMPT_RT. Link: https://lkml.kernel.org/r/20220226204144.1008339-3-bigeasy@xxxxxxxxxxxxx Suggested-by: Michal Hocko <mhocko@xxxxxxxxxx> Suggested-by: Michal Koutný <mkoutny@xxxxxxxx> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> Acked-by: Roman Gushchin <guro@xxxxxx> Acked-by: Johannes Weiner <hannes@xxxxxxxxxxx> Reviewed-by: Shakeel Butt <shakeelb@xxxxxxxxxx> Acked-by: Michal Hocko <mhocko@xxxxxxxx> Cc: kernel test robot <oliver.sang@xxxxxxxxx> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx> Cc: Waiman Long <longman@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- Documentation/admin-guide/cgroup-v1/memory.rst | 2 ++ mm/memcontrol.c | 14 ++++++++++++-- 2 files changed, 14 insertions(+), 2 deletions(-) --- a/Documentation/admin-guide/cgroup-v1/memory.rst~mm-memcg-disable-threshold-event-handlers-on-preempt_rt +++ a/Documentation/admin-guide/cgroup-v1/memory.rst @@ -64,6 +64,7 @@ Brief summary of control files. threads cgroup.procs show list of processes cgroup.event_control an interface for event_fd() + This knob is not available on CONFIG_PREEMPT_RT systems. memory.usage_in_bytes show current usage for memory (See 5.5 for details) memory.memsw.usage_in_bytes show current usage for memory+Swap @@ -75,6 +76,7 @@ Brief summary of control files. memory.max_usage_in_bytes show max memory usage recorded memory.memsw.max_usage_in_bytes show max memory+Swap usage recorded memory.soft_limit_in_bytes set/show soft limit of memory usage + This knob is not available on CONFIG_PREEMPT_RT systems. memory.stat show various statistics memory.use_hierarchy set/show hierarchical account enabled This knob is deprecated and shouldn't be --- a/mm/memcontrol.c~mm-memcg-disable-threshold-event-handlers-on-preempt_rt +++ a/mm/memcontrol.c @@ -858,6 +858,9 @@ static bool mem_cgroup_event_ratelimit(s */ static void memcg_check_events(struct mem_cgroup *memcg, int nid) { + if (IS_ENABLED(CONFIG_PREEMPT_RT)) + return; + /* threshold event is triggered in finer grain than soft limit */ if (unlikely(mem_cgroup_event_ratelimit(memcg, MEM_CGROUP_TARGET_THRESH))) { @@ -3731,8 +3734,12 @@ static ssize_t mem_cgroup_write(struct k } break; case RES_SOFT_LIMIT: - memcg->soft_limit = nr_pages; - ret = 0; + if (IS_ENABLED(CONFIG_PREEMPT_RT)) { + ret = -EOPNOTSUPP; + } else { + memcg->soft_limit = nr_pages; + ret = 0; + } break; } return ret ?: nbytes; @@ -4708,6 +4715,9 @@ static ssize_t memcg_write_event_control char *endp; int ret; + if (IS_ENABLED(CONFIG_PREEMPT_RT)) + return -EOPNOTSUPP; + buf = strstrip(buf); efd = simple_strtoul(buf, &endp, 10); _