Quoting Daniel P. Berrange (berrange@xxxxxxxxxx):
> IIUC, currently, in order to be able to invoke setns() the calling
> process is required to have CAP_SYS_ADMIN capability. Alternatively
> with user namespaces, access is allowed if the process' effective
> uid matches the uid of the owner of the namespace.
>
> I have a scenario though that isn't dealt with by those rules. I have
> a container that was spawned by uid == 0, and I have an unprivileged
> process uid != 0 in the initial namespace that I wish to allow to enter
> the namespaces associated with the container.
>
> With the CAP_SYS_ADMIN rule on setns(), it seems that the unprivileged
> process will need to execute some setuid binary in order to gain the
> CAP_SYS_ADMIN capability required to call setns().
>
> Of course you can't call setns() without first obtaining a file
> descriptor corresponding to the /proc/$CONTAINERPID/ns/XXXX namespace
> you wish to join. This appears to require elevated privileges too.
>
> Is it not sufficient to rely on the permissions on the /proc/$PID/ns/XXX
> file to control access to a namespace, and thus allow setns() without
> a CAP_SYS_ADMIN check? I.e. setns() is basically useless unless you
> already have sufficient privileges to get a file descriptor for the
> namespace, so why does setns() need an additional privilege check beyond
> that done at time of open() on the proc file?
>
> In my scenario this would allow for a privileged daemon to open the
> /proc/$PID/ns/XXX file, and then pass the file descriptor back to the
> unprivileged process using SCM_RIGHTS, which could then invoke setns().
> This kind of privilege separation is preferable to requiring the
> unprivileged process to run some kind of setuid program to gain the
> CAP_SYS_ADMIN capability.
>
> Regards,
> Daniel

Hey Daniel,

it's possible that as things settle down we can find ways to relax
this, but note that this behavior was very specifically added by the
commit below.

Very unfortunate that this is a feature regression over the
CONFIG_USER_NS=n case :(

5e4a08476b50fa39210fca82e03325cc46b9c235
Author: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
Date:   Fri Dec 14 07:55:36 2012 -0800

    userns: Require CAP_SYS_ADMIN for most uses of setns.

    Andy Lutomirski <luto@xxxxxxxxxxxxxx> found a nasty little bug in
    the permissions of setns.  With unprivileged user namespaces it
    became possible to create new namespaces without privilege.

    However the setns calls were relaxed to only require CAP_SYS_ADMIN
    in the user namespace of the targeted namespace.

    Which made the following nasty sequence possible.

    pid = clone(CLONE_NEWUSER | CLONE_NEWNS);
    if (pid == 0) { /* child */
            system("mount --bind /home/me/passwd /etc/passwd");
    }
    else if (pid != 0) { /* parent */
            char path[PATH_MAX];
            snprintf(path, sizeof(path), "/proc/%u/ns/mnt", pid);
            fd = open(path, O_RDONLY);
            setns(fd, 0);
            system("su -");
    }

    Prevent this possibility by requiring CAP_SYS_ADMIN in the current
    user namespace when joining all but the user namespace.
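
For reference, the fd hand-off you describe (privileged daemon opens the
ns file, unprivileged process receives it over SCM_RIGHTS) could look
roughly like the sketch below. send_fd() and recv_fd() are hypothetical
helper names I made up for illustration, and the sketch assumes an
already connected AF_UNIX socket between the two processes. It only
covers passing the descriptor, not the setns() call itself.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send one open descriptor over a connected
 * AF_UNIX socket as SCM_RIGHTS ancillary data.  Returns 0 or -1. */
int send_fd(int sock, int fd)
{
    char byte = 'F';    /* one byte of ordinary data carries the cmsg */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {             /* union keeps the control buffer aligned */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor sent by send_fd().
 * Returns the new descriptor in this process, or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The daemon would call send_fd() with the descriptor it got from open()
on the ns file, and the unprivileged side would pass the result of
recv_fd() to setns(fd, 0). With the commit above applied, that setns()
call still fails without CAP_SYS_ADMIN, so this only illustrates the
mechanism, not a way around the check.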