In [1] Michal Koutný <mkoutny@xxxxxxxx> reported that it does not make sense to unconditionally exempt the INIT_USER during fork and exec from RLIMIT_NPROC and then to impose a limit on that same user with is_ucounts_overlimit. So I looked into why the exeception was added. commit 909cc4ae86f3 ("[PATCH] Fix two bugs with process limits (RLIMIT_NPROC)") says: > If a setuid process swaps it's real and effective uids and then > forks, the fork fails if the new realuid has more processes than > the original process was limited to. This is particularly a > problem if a user with a process limit (e.g. 256) runs a > setuid-root program which does setuid() + fork() (e.g. lprng) while > root already has more than 256 process (which is quite possible). > > The root problem here is that a limit which should be a per-user > limit is being implemented as a per-process limit with per-process > (e.g. CAP_SYS_RESOURCE) controls. Being a per-user limit, it > should be that the root-user can over-ride it, not just some > process with CAP_SYS_RESOURCE. > > This patch adds a test to ignore process limits if the real user is root. With the moving of the limits from per-user to per-user-per-user_ns it is clear that testing a user_struct is no longer the proper test and the test should be a test against the ucounts struct to match the rest of the logic change that was made. With RLIMIT_NPROC not being enforced for the global root user anywhere else should it be enforced in is_ucounts_overlimit for a user namespace created by the global root user? Since this is limited to just the global root user, and RLIMIT_NPROC is already ignored for that user I am going to vote no. This change does imply that it becomes possible to limit all users in a user namespace but to not enforce the rlimits on the root user or anyone with CAP_SYS_ADMIN and CAP_SYS_RESOURCE in the user namespace. It is not clear to me why any of those exceptions exist so I figure we should until this is actually a problem for someone before we relax the permission checks here. Cc: stable@xxxxxxxxxxxxxxx Reported-by: Michal Koutný <mkoutny@xxxxxxxx> [1] https://lkml.kernel.org/r/20220207121800.5079-5-mkoutny@xxxxxxxx History-Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx> --- fs/exec.c | 2 +- kernel/fork.c | 2 +- kernel/user_namespace.c | 2 ++ 3 files changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index 1e7f757cbc2c..01c8c7bae9b4 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1881,7 +1881,7 @@ static int do_execveat_common(int fd, struct filename *filename, */ if ((current->flags & PF_NPROC_CHECK) && is_ucounts_overlimit(current_ucounts(), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) && - (current_user() != INIT_USER) && + (current_ucounts() != &init_ucounts) && !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) { retval = -EAGAIN; goto out_ret; diff --git a/kernel/fork.c b/kernel/fork.c index 2b6a28a86325..6f62d37f3650 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2027,7 +2027,7 @@ static __latent_entropy struct task_struct *copy_process( retval = -EAGAIN; if (is_ucounts_overlimit(task_ucounts(p), UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC))) { - if (p->real_cred->user != INIT_USER && + if ((task_ucounts(p) != &init_ucounts) && !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) goto bad_fork_cleanup_count; } diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c index 6b2e3ca7ee99..f0c04073403d 100644 --- a/kernel/user_namespace.c +++ b/kernel/user_namespace.c @@ -123,6 +123,8 @@ int create_user_ns(struct cred *new) ns->ucount_max[i] = INT_MAX; } set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)); + if (new->ucounts == &init_ucounts) + set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_NPROC, RLIMIT_INFINITY); set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_MSGQUEUE, rlimit(RLIMIT_MSGQUEUE)); set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_SIGPENDING, rlimit(RLIMIT_SIGPENDING)); set_rlimit_ucount_max(ns, UCOUNT_RLIMIT_MEMLOCK, rlimit(RLIMIT_MEMLOCK)); -- 2.29.2