From: Serge Hallyn <serge.hallyn@xxxxxxxxxxxxx>

sysctl.c has its own custom uid check, which is not user namespace
aware.  As discovered by Richard, that allows root in a container
privileged access to set all sysctls.

To fix that, don't compare uid or groups if current is not in the
initial user namespace.

We may at some point want to relax that check so that some sysctls
are allowed - for instance dmesg_restrict when syslog is containerized.

Changelog:
	Sep 22: As Miquel van Smoorenburg pointed out, rather than always
		refusing access if not in initial user_ns, we should allow
		world access rights to sysctl files.  We just want to
		prevent a task in a non-init user namespace from getting
		the root user or group access rights.

Signed-off-by: Serge Hallyn <serge.hallyn@xxxxxxxxxxxxx>
Cc: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
Cc: Vasiliy Kulikov <segoon@xxxxxxxxxxxx>
Cc: richard@xxxxxx
Cc: Miquel van Smoorenburg <mikevs@xxxxxxxxxx>
---
 kernel/sysctl.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index ae27196..473df41 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1708,10 +1708,12 @@ void register_sysctl_root(struct ctl_table_root *root)
 
 static int test_perm(int mode, int op)
 {
-	if (!current_euid())
-		mode >>= 6;
-	else if (in_egroup_p(0))
-		mode >>= 3;
+	if (current_user_ns() == &init_user_ns) {
+		if (!current_euid())
+			mode >>= 6;
+		else if (in_egroup_p(0))
+			mode >>= 3;
+	}
 	if ((op & ~mode & (MAY_READ|MAY_WRITE|MAY_EXEC)) == 0)
 		return 0;
 	return -EACCES;
-- 
1.7.0.4
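
To illustrate the effect of the patched test_perm(), here is a minimal
userspace sketch of the same bit arithmetic. It is not kernel code:
struct caller, test_perm_model() and run_examples() are hypothetical
names made up for this illustration. The three struct fields stand in
for current_user_ns() == &init_user_ns, current_euid() and
in_egroup_p(0), and the MAY_* values are assumed to match the kernel's
definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's MAY_* permission bits. */
#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/* Hypothetical stand-in for the calling task's credentials. */
struct caller {
	bool in_init_user_ns;	/* current_user_ns() == &init_user_ns */
	unsigned int euid;	/* current_euid() */
	bool in_root_group;	/* in_egroup_p(0) */
};

/*
 * Model of the patched test_perm(): the owner and group bits of the
 * sysctl mode are only considered for tasks in the initial user
 * namespace.  Everyone else is checked against the "other" bits.
 */
static int test_perm_model(const struct caller *c, int mode, int op)
{
	if (c->in_init_user_ns) {
		if (c->euid == 0)
			mode >>= 6;	/* use owner (root) bits */
		else if (c->in_root_group)
			mode >>= 3;	/* use group bits */
	}
	if ((op & ~mode & (MAY_READ | MAY_WRITE | MAY_EXEC)) == 0)
		return 0;
	return -EACCES;
}

/* Walk through a sysctl with mode 0644 (root rw, everyone else r). */
static void run_examples(void)
{
	struct caller host_root      = { true,  0,    false };
	struct caller container_root = { false, 0,    false };
	struct caller host_user      = { true,  1000, false };

	/* Root in the initial namespace may still write. */
	assert(test_perm_model(&host_root, 0644, MAY_WRITE) == 0);
	/* uid 0 in a non-init namespace only gets world rights. */
	assert(test_perm_model(&container_root, 0644, MAY_WRITE) == -EACCES);
	assert(test_perm_model(&container_root, 0644, MAY_READ) == 0);
	/* An unprivileged host user is unaffected by the change. */
	assert(test_perm_model(&host_user, 0644, MAY_WRITE) == -EACCES);
}
```

With the pre-patch logic, the container_root case would have taken the
mode >>= 6 branch and been granted write access, which is the problem
described above.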