Inspired by unshare, enter is a simple wrapper around setns that
allows running a new process in the context of an existing process.

Full paths may be specified to the namespace arguments so that
namespace file descriptors may be used wherever they reside in the
filesystem.

Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
---

While doing a final check on this patch I just realized I am a week or
two late to the discussion. So much for waiting until my code had
merged into the kernel before submitting patches.

I have been developing enter off and on as I have been developing
these patches, and it seems to be stable and feature complete at this
point.

I really don't like the idea of adding setns support into unshare.
Creating new namespaces and using existing namespaces are related but
different concepts, and they place different demands on the evolution
of the code, especially when the pid and user namespaces come into
play.

Little things like retaining the ability for unshare to be suid root
safely and sanely become intractable if you call setns() and join a
user namespace. Supporting the ability for the command to be setuid
root does not work in combination with the user namespace, because
after entering the user namespace you cannot reliably drop back to the
uid you had before setuid, as that uid may not be mapped.

When joining an existing mount namespace you most likely want to
change your root directory and your working directory to those of the
process whose mount namespace you are entering. That is something you
don't even think about when just unsharing a mount namespace.

Then there is the practical wish to call fork after entering a pid
namespace and before launching a command. You don't always want that,
but you almost always do, so that the command itself actually runs in
the new pid namespace with a new pid, instead of only its children
being in the new pid namespace.
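To make the fork-after-setns point concrete, here is a minimal sketch
in C. It is an illustration under stated assumptions, not the code in
this patch: ns_path() and run_in_pid_namespace() are hypothetical
helpers, the raw syscall(SYS_setns, ...) call stands in for a libc
setns() wrapper, and the CLONE_NEWPID fallback value is the one the
patch itself defines.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_NEWPID
# define CLONE_NEWPID 0x20000000
#endif

/* Hypothetical helper: write "/proc/<pid>/ns/<name>" into buf.
 * Returns 0 on success, -1 if name contains '/' or the path does not fit. */
static int ns_path(char *buf, size_t size, pid_t pid, const char *name)
{
	int n;

	if (strchr(name, '/'))
		return -1;
	n = snprintf(buf, size, "/proc/%ld/ns/%s", (long)pid, name);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: join the pid namespace of target, then fork so
 * that the command itself (not just its children) runs inside it.
 * Returns the command's exit status, or -1 on any failure. */
static int run_in_pid_namespace(pid_t target, char *const argv[])
{
	char path[64];
	int fd, status;
	pid_t child;

	assert(argv && argv[0]);
	if (ns_path(path, sizeof(path), target, "pid") < 0)
		return -1;
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	/* Raw syscall so the sketch does not depend on libc's setns(). */
	if (syscall(SYS_setns, fd, CLONE_NEWPID) < 0) {
		close(fd);
		return -1;
	}
	close(fd);

	/* Joining a pid namespace only affects children created from now on. */
	child = fork();
	if (child < 0)
		return -1;
	if (child == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* exec failed */
	}
	if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

With a real target pid and sufficient privilege (setns(2) generally
requires CAP_SYS_ADMIN), run_in_pid_namespace(pid, argv) would run
argv[0] as a new process inside that namespace. Without the fork, the
caller keeps its own pid namespace and only its later children land in
the new one.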

I really can't see support for using setns being in the same binary as
unshare; that just mixes two different but closely related things that
will want to evolve in different directions.

My inclination is to send a follow up patch to remove setns and migrate
from unshare, and a second patch to add pid and user namespace support
to unshare. But since I am going against the way that seems to have
already been decided, I will hold off on those patches until after
there is agreement on this one.

Eric

 configure.ac            |   11 ++
 sys-utils/Makemodule.am |    7 +
 sys-utils/enter.1       |  101 +++++++++++++++++
 sys-utils/enter.c       |  286 +++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 405 insertions(+), 0 deletions(-)
 create mode 100644 sys-utils/enter.1
 create mode 100644 sys-utils/enter.c

diff --git a/configure.ac b/configure.ac
index e937736..b0c9c6f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -867,6 +867,17 @@ if test "x$build_unshare" = xyes; then
   AC_CHECK_FUNCS([unshare])
 fi
 
+AC_ARG_ENABLE([enter],
+  AS_HELP_STRING([--disable-enter], [do not build enter]),
+  [], enable_enter=check
+)
+UL_BUILD_INIT([enter])
+UL_REQUIRES_LINUX([enter])
+UL_REQUIRES_SYSCALL_CHECK([setns], [UL_CHECK_SYSCALL([setns])])
+AM_CONDITIONAL(BUILD_ENTER, test "x$build_enter" = xyes)
+if test "x$build_enter" = xyes; then
+  AC_CHECK_FUNCS([setns])
+fi
 
 AC_ARG_ENABLE([arch],
   AS_HELP_STRING([--enable-arch], [do build arch]),
diff --git a/sys-utils/Makemodule.am b/sys-utils/Makemodule.am
index 5636f70..6ad09b2 100644
--- a/sys-utils/Makemodule.am
+++ b/sys-utils/Makemodule.am
@@ -290,6 +290,13 @@ unshare_SOURCES = sys-utils/unshare.c
 unshare_LDADD = $(LDADD) libcommon.la
 endif
 
+if BUILD_ENTER
+usrbin_exec_PROGRAMS += enter
+dist_man_MANS += sys-utils/enter.1
+enter_SOURCES = sys-utils/enter.c
+enter_LDADD = $(LDADD) libcommon.la
+endif
+
 if BUILD_ARCH
 bin_PROGRAMS += arch
 dist_man_MANS += sys-utils/arch.1
diff --git a/sys-utils/enter.1 b/sys-utils/enter.1
new file mode 100644
index 0000000..0829ee2
--- /dev/null
+++ b/sys-utils/enter.1
@@ -0,0 +1,101 @@
+.TH ENTER 1 "January 2013" "util-linux" "User Commands"
+.SH NAME
+enter \- run program with namespaces of other processes
+.SH SYNOPSIS
+.B enter
+.RI [ options ]
+program
+.RI [ arguments ]
+.SH DESCRIPTION
+Enters the contexts of one or more other processes and then executes the specified
+program. Enterable namespaces are:
+.TP
+.BR "mount namespace"
+mounting and unmounting filesystems will not affect rest of the system
+(\fBCLONE_NEWNS\fP flag), except for filesystems which are explicitly marked as
+shared (by mount --make-shared). See /proc/self/mountinfo for the shared flags.
+.TP
+.BR "UTS namespace"
+setting hostname, domainname will not affect rest of the system
+(\fBCLONE_NEWUTS\fP flag).
+.TP
+.BR "IPC namespace"
+process will have independent namespace for System V message queues, semaphore
+sets and shared memory segments (\fBCLONE_NEWIPC\fP flag).
+.TP
+.BR "network namespace"
+process will have independent IPv4 and IPv6 stacks, IP routing tables, firewall
+rules, the \fI/proc/net\fP and \fI/sys/class/net\fP directory trees, sockets
+etc. (\fBCLONE_NEWNET\fP flag).
+.TP
+.BR "pid namespace"
+children will have a distinct set of pid to process mappings than their parent
+(\fBCLONE_NEWPID\fP flag).
+.TP
+.BR "user namespace"
+process will have distinct set of uids, gids and capabilities (\fBCLONE_NEWUSER\fP flag).
+.PP
+See \fBclone\fR(2) for exact semantics of the flags.
+.SH OPTIONS
+.TP
+.BR \-h , " \-\-help"
+Print a help message.
+.TP
+.BR \-t , " \-\-target " \fIpid\fP
+Specify a target process to get contexts from.
+.TP
+.BR \-m , " \-\-mount"=[\fIfile\fP]
+Enter the mount namespace.
+If no file is specified enter the mount namespace of the target process.
+If file is specified enter the mount namespace specified by file.
+.TP
+.BR \-u , " \-\-uts"=[\fIfile\fP]
+Enter the uts namespace.
+If no file is specified enter the uts namespace of the target process.
+If file is specified enter the uts namespace specified by file.
+.TP
+.BR \-i , " \-\-ipc"=[\fIfile\fP]
+Enter the IPC namespace.
+If no file is specified enter the IPC namespace of the target process.
+If file is specified enter the IPC namespace specified by file.
+.TP
+.BR \-n , " \-\-net"=[\fIfile\fP]
+Enter the network namespace.
+If no file is specified enter the network namespace of the target process.
+If file is specified enter the network namespace specified by file.
+.TP
+.BR \-p , " \-\-pid"=[\fIfile\fP]
+Enter the pid namespace.
+If no file is specified enter the pid namespace of the target process.
+If file is specified enter the pid namespace specified by file.
+.TP
+.BR \-U , " \-\-user"=[\fIfile\fP]
+Enter the user namespace.
+If no file is specified enter the user namespace of the target process.
+If file is specified enter the user namespace specified by file.
+.TP
+.BR \-r , " \-\-root"=[\fIdirectory\fP]
+Set the root directory.
+If no directory is specified set the root directory to the root directory of the target process.
+If directory is specified set the root directory to the specified directory.
+.TP
+.BR \-w , " \-\-wd"=[\fIdirectory\fP]
+Set the working directory.
+If no directory is specified set the working directory to the working directory of the target process.
+If directory is specified set the working directory to the specified directory.
+.TP
+.BR \-e , " \-\-exec"
+Don't fork before exec'ing the specified program. By default when entering
+a pid namespace enter calls fork before calling exec so that the children will
+be in the newly entered pid namespace.
+.SH NOTES
+.SH SEE ALSO
+.BR setns (2),
+.BR clone (2)
+.SH BUGS
+None known so far.
+.SH AUTHOR
+Eric Biederman <ebiederm@xxxxxxxxxxxx>
+.SH AVAILABILITY
+The enter command is part of the util-linux package and is available from
+ftp://ftp.kernel.org/pub/linux/utils/util-linux/.
diff --git a/sys-utils/enter.c b/sys-utils/enter.c
new file mode 100644
index 0000000..d7bd540
--- /dev/null
+++ b/sys-utils/enter.c
@@ -0,0 +1,286 @@
+/*
+ * enter(1) - command-line interface for setns(2)
+ *
+ * Copyright (C) 2012-2013 Eric Biederman <ebiederm@xxxxxxxxxxxx>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; version 2.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <dirent.h>
+#include <errno.h>
+#include <getopt.h>
+#include <sched.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include "nls.h"
+#include "c.h"
+#include "closestream.h"
+
+#ifndef CLONE_NEWNS
+# define CLONE_NEWNS 0x00020000
+#endif
+#ifndef CLONE_NEWUTS
+# define CLONE_NEWUTS 0x04000000
+#endif
+#ifndef CLONE_NEWIPC
+# define CLONE_NEWIPC 0x08000000
+#endif
+#ifndef CLONE_NEWNET
+# define CLONE_NEWNET 0x40000000
+#endif
+#ifndef CLONE_NEWUSER
+# define CLONE_NEWUSER 0x10000000
+#endif
+#ifndef CLONE_NEWPID
+# define CLONE_NEWPID 0x20000000
+#endif
+
+#ifndef HAVE_SETNS
+# include <sys/syscall.h>
+static int setns(int fd, int nstype)
+{
+	return syscall(SYS_setns, fd, nstype);
+}
+#endif /* HAVE_SETNS */
+
+static struct namespace_file {
+	int nstype;
+	char *name;
+	int fd;
+} namespace_files[] = {
+	/* Careful, the order is significant in this array.
+	 *
+	 * The user namespace comes first, so that it is entered
+	 * first.  This gives an unprivileged user the potential to
+	 * enter the other namespaces.
+	 */
+	{ .nstype = CLONE_NEWUSER, .name = "ns/user", .fd = -1 },
+	{ .nstype = CLONE_NEWIPC,  .name = "ns/ipc",  .fd = -1 },
+	{ .nstype = CLONE_NEWUTS,  .name = "ns/uts",  .fd = -1 },
+	{ .nstype = CLONE_NEWNET,  .name = "ns/net",  .fd = -1 },
+	{ .nstype = CLONE_NEWPID,  .name = "ns/pid",  .fd = -1 },
+	{ .nstype = CLONE_NEWNS,   .name = "ns/mnt",  .fd = -1 },
+	{}
+};
+
+static void usage(int status)
+{
+	FILE *out = status == EXIT_SUCCESS ? stdout : stderr;
+
+	fputs(USAGE_HEADER, out);
+	fprintf(out, _(" %s [options] <program> [args...]\n"),
+		program_invocation_short_name);
+
+	fputs(USAGE_OPTIONS, out);
+	fputs(_(" -t, --target <pid>     target process to get namespaces from\n"
+		" -m, --mount [<file>]   enter mount namespace\n"
+		" -u, --uts [<file>]     enter UTS namespace (hostname etc)\n"
+		" -i, --ipc [<file>]     enter System V IPC namespace\n"
+		" -n, --net [<file>]     enter network namespace\n"
+		" -p, --pid [<file>]     enter pid namespace\n"
+		" -U, --user [<file>]    enter user namespace\n"
+		" -e, --exec             don't fork before exec'ing <program>\n"
+		" -r, --root [<dir>]     set the root directory\n"
+		" -w, --wd [<dir>]       set the working directory\n"), out);
+	fputs(USAGE_SEPARATOR, out);
+	fputs(USAGE_HELP, out);
+	fputs(USAGE_VERSION, out);
+	fprintf(out, USAGE_MAN_TAIL("enter(1)"));
+
+	exit(status);
+}
+
+static pid_t namespace_target_pid = 0;
+static int root_fd = -1;
+static int wd_fd = -1;
+
+static void open_target_fd(int *fd, const char *type, char *path)
+{
+	char pathbuf[PATH_MAX];
+
+	if (!path && namespace_target_pid) {
+		snprintf(pathbuf, sizeof(pathbuf), "/proc/%u/%s",
+			 namespace_target_pid, type);
+		path = pathbuf;
+	}
+	if (!path)
+		errx(EXIT_FAILURE, _("No filename and no target pid supplied for %s"),
+		     type);
+
+	if (*fd >= 0)
+		close(*fd);
+
+	*fd = open(path, O_RDONLY);
+	if (*fd < 0)
+		err(EXIT_FAILURE, _("open of '%s' failed"), path);
+}
+
+static void open_namespace_fd(int nstype, char *path)
+{
+	struct namespace_file *nsfile;
+
+	for (nsfile = namespace_files; nsfile->nstype; nsfile++) {
+		if (nstype != nsfile->nstype)
+			continue;
+
+		open_target_fd(&nsfile->fd, nsfile->name, path);
+		return;
+	}
+	/* This should never happen */
+	errx(EXIT_FAILURE, "Unrecognized namespace type");
+}
+
+int main(int argc, char *argv[])
+{
+	static const struct option longopts[] = {
+		{ "help", no_argument, NULL, 'h' },
+		{ "version", no_argument, NULL, 'V'},
+		{ "target", required_argument, NULL, 't' },
+		{ "mount", optional_argument, NULL, 'm' },
+		{ "uts", optional_argument, NULL, 'u' },
+		{ "ipc", optional_argument, NULL, 'i' },
+		{ "net", optional_argument, NULL, 'n' },
+		{ "pid", optional_argument, NULL, 'p' },
+		{ "user", optional_argument, NULL, 'U' },
+		{ "exec", no_argument, NULL, 'e' },
+		{ "root", optional_argument, NULL, 'r' },
+		{ "wd", optional_argument, NULL, 'w' },
+		{ NULL, 0, NULL, 0 }
+	};
+
+	struct namespace_file *nsfile;
+	int do_fork = 0;
+	char *end;
+	int c;
+
+	setlocale(LC_MESSAGES, "");
+	bindtextdomain(PACKAGE, LOCALEDIR);
+	textdomain(PACKAGE);
+	atexit(close_stdout);
+
+	while((c = getopt_long(argc, argv, "hVt:m::u::i::n::p::U::er::w::", longopts, NULL)) != -1) {
+		switch(c) {
+		case 'h':
+			usage(EXIT_SUCCESS);
+		case 'V':
+			printf(UTIL_LINUX_VERSION);
+			return EXIT_SUCCESS;
+		case 't':
+			errno = 0;
+			namespace_target_pid = strtoul(optarg, &end, 10);
+			if (!*optarg || (*optarg && *end) || errno != 0) {
+				errx(EXIT_FAILURE,
+				     _("Pid '%s' is not a valid number"),
+				     optarg);
+			}
+			break;
+		case 'm':
+			open_namespace_fd(CLONE_NEWNS, optarg);
+			break;
+		case 'u':
+			open_namespace_fd(CLONE_NEWUTS, optarg);
+			break;
+		case 'i':
+			open_namespace_fd(CLONE_NEWIPC, optarg);
+			break;
+		case 'n':
+			open_namespace_fd(CLONE_NEWNET, optarg);
+			break;
+		case 'p':
+			do_fork = 1;
+			open_namespace_fd(CLONE_NEWPID, optarg);
+			break;
+		case 'U':
+			open_namespace_fd(CLONE_NEWUSER, optarg);
+			break;
+		case 'e':
+			do_fork = 0;
+			break;
+		case 'r':
+			open_target_fd(&root_fd, "root", optarg);
+			break;
+		case 'w':
+			open_target_fd(&wd_fd, "cwd", optarg);
+			break;
+		default:
+			usage(EXIT_FAILURE);
+		}
+	}
+
+	if(optind >= argc)
+		usage(EXIT_FAILURE);
+
+	/*
+	 * Now that we know which namespaces we want to enter, enter them.
+	 */
+	for (nsfile = namespace_files; nsfile->nstype; nsfile++) {
+		if (nsfile->fd < 0)
+			continue;
+		if (setns(nsfile->fd, nsfile->nstype))
+			err(EXIT_FAILURE, _("setns of '%s' failed"),
+			    nsfile->name);
+		close(nsfile->fd);
+		nsfile->fd = -1;
+	}
+
+	/* Remember the current working directory if I'm not changing it */
+	if (root_fd >= 0 && wd_fd < 0) {
+		wd_fd = open(".", O_RDONLY);
+		if (wd_fd < 0)
+			err(EXIT_FAILURE, _("open of . failed"));
+	}
+
+	/* Change the root directory */
+	if (root_fd >= 0) {
+		if (fchdir(root_fd) < 0)
+			err(EXIT_FAILURE, _("fchdir to root_fd failed"));
+
+		if (chroot(".") < 0)
+			err(EXIT_FAILURE, _("chroot failed"));
+
+		close(root_fd);
+		root_fd = -1;
+	}
+
+	/* Change the working directory */
+	if (wd_fd >= 0) {
+		if (fchdir(wd_fd) < 0)
+			err(EXIT_FAILURE, _("fchdir to wd_fd failed"));
+
+		close(wd_fd);
+		wd_fd = -1;
+	}
+
+	if (do_fork) {
+		pid_t child = fork();
+		if (child < 0)
+			err(EXIT_FAILURE, _("fork failed"));
+		if (child != 0) {
+			int status;
+			if ((waitpid(child, &status, 0) == child) &&
+			     WIFEXITED(status)) {
+				exit(WEXITSTATUS(status));
+			}
+			exit(EXIT_FAILURE);
+		}
+	}
+
+	execvp(argv[optind], argv + optind);
+
+	err(EXIT_FAILURE, _("exec %s failed"), argv[optind]);
+}
-- 
1.7.5.4