The patch titled
     Subject: kernel/sys: optimize do_prlimit lock scope to reduce contention
has been added to the -mm mm-nonmm-unstable branch.  Its filename is
     kernel-sys-optimize-do_prlimit-lock-scope-to-reduce-contention.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/kernel-sys-optimize-do_prlimit-lock-scope-to-reduce-contention.patch

This patch will later appear in the mm-nonmm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Zhen Ni <zhen.ni@xxxxxxxxxxxx>
Subject: kernel/sys: optimize do_prlimit lock scope to reduce contention
Date: Wed, 20 Nov 2024 21:21:56 +0800

Refine the lock scope in the do_prlimit function to reduce contention on
task_lock(tsk->group_leader).  The lock now protects only sections that
access or modify shared resources (rlim).  Permission checks (capable) and
security validations (security_task_setrlimit) are placed outside the
lock, as they do not modify rlim and are independent of shared data
protection.

security_task_setrlimit() is a Linux Security Module (LSM) hook that
evaluates resource limit changes based on security policies.  It does not
alter the rlim data structure, as confirmed by existing LSM
implementations (e.g., SELinux and AppArmor).  Thus, this function does
not require locking, ensuring correctness while improving concurrency.

Link: https://lkml.kernel.org/r/20241120132156.207250-1-zhen.ni@xxxxxxxxxxxx
Signed-off-by: Zhen Ni <zhen.ni@xxxxxxxxxxxx>
Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
Cc: Christian Brauner <brauner@xxxxxxxxxx>
Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
Cc: Zev Weiss <zev@xxxxxxxxxxxxxxxxx>
Cc: James Morris <jmorris@xxxxxxxxx>
Cc: Paul Moore <paul@xxxxxxxxxxxxxx>
Cc: Serge E. Hallyn <serge@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 kernel/sys.c |   12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

--- a/kernel/sys.c~kernel-sys-optimize-do_prlimit-lock-scope-to-reduce-contention
+++ a/kernel/sys.c
@@ -1481,18 +1481,20 @@ static int do_prlimit(struct task_struct
 
 	/* Holding a refcount on tsk protects tsk->signal from disappearing. */
 	rlim = tsk->signal->rlim + resource;
-	task_lock(tsk->group_leader);
 	if (new_rlim) {
 		/*
 		 * Keep the capable check against init_user_ns until cgroups can
 		 * contain all limits.
 		 */
 		if (new_rlim->rlim_max > rlim->rlim_max &&
-				!capable(CAP_SYS_RESOURCE))
-			retval = -EPERM;
-		if (!retval)
-			retval = security_task_setrlimit(tsk, resource, new_rlim);
+		    !capable(CAP_SYS_RESOURCE))
+			return -EPERM;
+		retval = security_task_setrlimit(tsk, resource, new_rlim);
+		if (retval)
+			return retval;
 	}
+
+	task_lock(tsk->group_leader);
 	if (!retval) {
 		if (old_rlim)
 			*old_rlim = *rlim;
_

Patches currently in -mm which might be from zhen.ni@xxxxxxxxxxxx are

kernel-sys-optimize-do_prlimit-lock-scope-to-reduce-contention.patch
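
For readers who want to see the lock-narrowing pattern described in the
changelog outside the kernel, here is a minimal userspace sketch.  It is not
kernel code and not part of the patch: struct limit, shared_limit,
limit_lock, caller_is_privileged(), policy_check() and update_limit() are all
hypothetical names, and a pthread mutex stands in for task_lock().  It only
illustrates the shape of the change, with the permission and policy checks
returning early before the lock is taken and the lock held just around the
copy-out and copy-in of the shared limit.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a resource limit; not the kernel's struct rlimit. */
struct limit {
	unsigned long cur;
	unsigned long max;
};

/* Shared state and the mutex that protects it (stand-in for task_lock()). */
static pthread_mutex_t limit_lock = PTHREAD_MUTEX_INITIALIZER;
static struct limit shared_limit = { .cur = 100, .max = 200 };

/* Hypothetical permission check; plays the role of capable(). */
static bool caller_is_privileged(void)
{
	return false;
}

/* Hypothetical policy hook; plays the role of an LSM check. 0 means allow. */
static int policy_check(const struct limit *new_limit)
{
	(void)new_limit;
	return 0;
}

/*
 * Optionally report the old limit and install a new one.
 * Checks that do not modify the shared limit run before the lock is taken.
 */
static int update_limit(const struct limit *new_limit, struct limit *old_limit)
{
	int ret;

	if (new_limit) {
		/* This comparison reads shared_limit.max without the lock. */
		if (new_limit->max > shared_limit.max && !caller_is_privileged())
			return -EPERM;
		ret = policy_check(new_limit);
		if (ret)
			return ret;
	}

	/* Only the read and write of the shared limit are serialized. */
	pthread_mutex_lock(&limit_lock);
	if (old_limit)
		*old_limit = shared_limit;
	if (new_limit)
		shared_limit = *new_limit;
	pthread_mutex_unlock(&limit_lock);

	return 0;
}
```

In this sketch the comparison against shared_limit.max happens before the
mutex is taken, so a concurrent update could change the limit between the
check and the write.  Whether that window is acceptable is a design decision
for the caller; the sketch makes no claim about how the kernel resolves it.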