Re: Bug: for mount namespaces inside a chroot, unshare works but nsenter doesn't

On Fri, Oct 27, 2017 at 06:07:00PM +0000, Ximin Luo wrote:
> When unsharing persistent mount namespaces, unshare+nsenter does not seem to
> work properly when run from inside a chroot session. However, unshare by itself
> works.

It's not related to persistent namespaces, but to the way nsenter
uses chroot().

> As a workaround for the unshare+nsenter case, one can run `nsenter --mount=<ns>
> chroot <real/path/to/chroot> command args`. The `--root` option to `nsenter`
> sounds like it should work, but it does not - see below for details.
> 
> Is this a bug? 

It looks like a logic problem in nsenter.

nsenter opens the root-dir and cwd file descriptors *before* the
setns() syscall, and then calls chroot() *after* it. The descriptors
therefore still refer to the directory in the original namespace's
mount tree, so the final process is in the new namespace, but its
root directory is not.

        open("/mnt/test/chroot/namespaces/mnt", O_RDONLY) = 3
        open("/mnt/test/chroot", O_RDONLY)      = 4
        open("/mnt/test/chroot", O_RDONLY)      = 5
        setns(3, CLONE_NEWNS)                   = 0
        close(3)                                = 0
        fchdir(4)                               = 0
        chroot(".")                             = 0
        close(4)                                = 0
        fchdir(5)                               = 0
        close(5)                                = 0
        execve("/bin/bash", ["-bash"], 0x7ffd2b5244d0 /* 31 vars */) = 0

The patch below fixes the issue. It just moves the root-dir and cwd
open() calls to *after* setns():

        open("/mnt/test/chroot/namespaces/mnt", O_RDONLY) = 3
        setns(3, CLONE_NEWNS)                   = 0
        close(3)                                = 0
        open("/mnt/test/chroot", O_RDONLY)      = 3
        open("/mnt/test/chroot", O_RDONLY)      = 4
        fchdir(4)                               = 0
        chroot(".")                             = 0
        close(4)                                = 0
        fchdir(3)                               = 0
        close(3)                                = 0
        execve("/bin/bash", ["-bash"], 0x7fff1ff8eb60 /* 31 vars */) = 0
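
For illustration only, the reordered sequence from the second trace can
be sketched as a standalone C function. This is a minimal sketch, not
the nsenter code: the name enter_mntns_then_chroot() is hypothetical,
and it handles only the mount namespace and the root directory (no cwd
handling, no other namespace types). It needs the same privileges as
nsenter itself to succeed.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of nsenter.c): enter the mount
 * namespace referenced by ns_path, then chroot into root_path.
 * Returns 0 on success, -1 on error with errno set by the failing call.
 */
int enter_mntns_then_chroot(const char *ns_path, const char *root_path)
{
	int ns_fd, root_fd, rc, saved;

	ns_fd = open(ns_path, O_RDONLY);
	if (ns_fd < 0)
		return -1;

	rc = setns(ns_fd, CLONE_NEWNS);
	saved = errno;
	close(ns_fd);
	errno = saved;		/* keep the errno from setns() */
	if (rc < 0)
		return -1;

	/*
	 * Open the root directory only now, after setns(), so that the
	 * path is resolved inside the new mount namespace.
	 */
	root_fd = open(root_path, O_RDONLY);
	if (root_fd < 0)
		return -1;

	rc = fchdir(root_fd);
	saved = errno;
	close(root_fd);
	errno = saved;		/* keep the errno from fchdir() */
	if (rc < 0)
		return -1;

	return chroot(".");
}
```

A caller would use it as enter_mntns_then_chroot("/mnt/test/chroot/namespaces/mnt",
"/mnt/test/chroot") and then exec the shell.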

Unfortunately, I'm not sure if this is the right way in all cases. 

Eric?


Examples:

*** I have a simple chroot directory:

        # ls -la /mnt/test/chroot
        total 20
        drwxr-xr-x    5 root root 4096 Nov  3 13:10 .
        drwxr-xr-x.   8 root root 4096 Nov  2 15:36 ..
        lrwxrwxrwx    1 root root    8 Nov  2 15:40 bin -> /usr/bin
        lrwxrwxrwx    1 root root    8 Nov  2 15:40 lib -> /usr/lib
        lrwxrwxrwx    1 root root   10 Nov  2 15:40 lib64 -> /usr/lib64
        drwxr-xr-x    4 root root 4096 Nov  3 13:22 namespaces
        dr-xr-xr-x  330 root root    0 Sep 26 22:17 proc
        lrwxrwxrwx    1 root root    9 Nov  2 15:40 sbin -> /usr/sbin
        drwxr-xr-x.  14 root root 4096 Aug 16 10:50 usr

where /usr is bind-mounted and /proc is mounted:

        # findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION --submounts /mnt/test/chroot
        TARGET                  SOURCE                      FSTYPE PROPAGATION
        /mnt/test/chroot        /dev/sda4[/mnt/test/chroot] ext4   private
        ├─/mnt/test/chroot/usr  /dev/sda4[/usr]             ext4   shared
        └─/mnt/test/chroot/proc proc                        proc   private

let's enter the chroot and create a persistent mount namespace within it:

        # chroot /mnt/test/chroot
        # unshare --mount=namespaces/mnt

our mount table:

        # findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION
        TARGET  SOURCE                      FSTYPE PROPAGATION
        /       /dev/sda4[/mnt/test/chroot] ext4   private
        ├─/usr  /dev/sda4[/usr]             ext4   private
        └─/proc proc                        proc   private

and our mount namespace:

        # ls -la /proc/self/ns | grep mnt
        lrwxrwxrwx 1 0 0 0 Nov  3 12:56 mnt -> mnt:[4026532457]
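
The number in mnt:[...] is the inode of the namespace, so comparing it
between two sessions tells you whether nsenter really landed in the
same mount namespace. As a rough sketch, the same check can be done
from C; the function names parse_ns_inode() and read_ns_inode() are
made up for this example and are not part of util-linux.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Parse link text such as "mnt:[4026532457]".
 * Returns the inode number, or 0 if the text does not match.
 */
unsigned long parse_ns_inode(const char *link)
{
	unsigned long ino = 0;
	char type[16];
	char closing;

	if (sscanf(link, "%15[a-z_]:[%lu%c", type, &ino, &closing) != 3
	    || closing != ']')
		return 0;
	return ino;
}

/*
 * Read a namespace symlink such as /proc/self/ns/mnt and return the
 * namespace inode, or 0 on error.
 */
unsigned long read_ns_inode(const char *ns_path)
{
	char buf[64];
	ssize_t n = readlink(ns_path, buf, sizeof(buf) - 1);

	if (n < 0)
		return 0;
	buf[n] = '\0';
	return parse_ns_inode(buf);
}
```

For example, read_ns_inode("/proc/self/ns/mnt") in the nsenter session
should return the same value as read_ns_inode("/proc/14411/ns/mnt")
for the shell kept alive in the chroot.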

our pid:

        # echo $$
        14411

IMHO it is a good idea to keep this shell alive in the chroot and use
another session to play with nsenter.

*** nsenter examples:

a) let's try it by PID, all works as expected:

        # nsenter --target 14411 --mount --root --wd

        # findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION
        TARGET  SOURCE                      FSTYPE PROPAGATION
        /       /dev/sda4[/mnt/test/chroot] ext4   private
        ├─/usr  /dev/sda4[/usr]             ext4   private
        └─/proc proc                        proc   private

        # ls -la /proc/self/ns | grep mnt
        lrwxrwxrwx 1 0 0 0 Nov  3 13:02 mnt -> mnt:[4026532457]

   Important note: in this case nsenter uses /proc/<target>/root for
   chroot(), but the goal is to use a persistent namespace, where no
   <target> is available.

b) let's try chroot() by path:

        # nsenter --target 14411 --mount --root=/mnt/test/chroot --wd=/mnt/test/chroot

        # findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION

   failed, the mount table is empty

c) let's try chroot by /proc paths:

        # nsenter --target 14411 --mount --root=/mnt/test/chroot/proc/14411/root --wd=/mnt/test/chroot/proc/14411/cwd

        # findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION
        TARGET  SOURCE                      FSTYPE PROPAGATION
        /       /dev/sda4[/mnt/test/chroot] ext4   private
        ├─/usr  /dev/sda4[/usr]             ext4   private
        └─/proc proc                        proc   private

        # ls -la /proc/self/ns | grep mnt
        lrwxrwxrwx 1 0 0 0 Nov  3 13:09 mnt -> mnt:[4026532457]

   it works!
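
The failure in case b) shows up as an empty mount table. If you want to
detect that from a program rather than by eye, one simple approach is
to count the entries in /proc/self/mountinfo. The helper below is only
a sketch under that assumption; count_mount_entries() is a hypothetical
name, and it simply counts lines without parsing them.

```c
#include <stdio.h>

/*
 * Count the lines in a mount table file such as /proc/self/mountinfo.
 * Returns the number of lines, or -1 if the file cannot be opened.
 */
long count_mount_entries(const char *path)
{
	FILE *f = fopen(path, "r");
	long n = 0;
	int c, last = '\n';

	if (!f)
		return -1;

	while ((c = fgetc(f)) != EOF) {
		if (c == '\n')
			n++;
		last = c;
	}
	if (last != '\n')
		n++;	/* final line without a trailing newline */

	fclose(f);
	return n;
}
```

A result of 0 for /proc/self/mountinfo after nsenter corresponds to the
broken state from case b); cases a) and c) give a non-zero count.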


Note that it makes no difference here whether the namespace is
specified by --target or by --mount=<persistent namespace>.

nsenter with the patch:


        # ./nsenter --mount=/mnt/test/chroot/namespaces/mnt  --root=/mnt/test/chroot --wd=/mnt/test/chroot

        # findmnt -oTARGET,SOURCE,FSTYPE,PROPAGATION
        TARGET  SOURCE                      FSTYPE PROPAGATION
        /       /dev/sda4[/mnt/test/chroot] ext4   private
        ├─/usr  /dev/sda4[/usr]             ext4   private
        └─/proc proc                        proc   private

        # ls -la /proc/self/ns | grep mnt
        lrwxrwxrwx 1 0 0 0 Nov  3 13:11 mnt -> mnt:[4026532457]

all works as expected. The patch is below.

    Karel


diff --git a/sys-utils/nsenter.c b/sys-utils/nsenter.c
index 9c452c1d1..464f9f98c 100644
--- a/sys-utils/nsenter.c
+++ b/sys-utils/nsenter.c
@@ -238,6 +238,7 @@ int main(int argc, char *argv[])
 	int do_fork = -1; /* unknown yet */
 	uid_t uid = 0;
 	gid_t gid = 0;
+	const char *rd_path = NULL, *wd_path = NULL;
 #ifdef HAVE_LIBSELINUX
 	bool selinux = 0;
 #endif
@@ -318,13 +319,13 @@ int main(int argc, char *argv[])
 			break;
 		case 'r':
 			if (optarg)
-				open_target_fd(&root_fd, "root", optarg);
+				rd_path = optarg;
 			else
 				do_rd = true;
 			break;
 		case 'w':
 			if (optarg)
-				open_target_fd(&wd_fd, "cwd", optarg);
+				wd_path = optarg;
 			else
 				do_wd = true;
 			break;
@@ -433,6 +434,11 @@ int main(int argc, char *argv[])
 		}
 	}
 
+	if (wd_path)
+		open_target_fd(&wd_fd, "cwd", wd_path);
+	if (rd_path)
+		open_target_fd(&root_fd, "root", rd_path);
+
 	/* Remember the current working directory if I'm not changing it */
 	if (root_fd >= 0 && wd_fd < 0) {
 		wd_fd = open(".", O_RDONLY);

    


> I'm trying to write code to work regardless of whether it's run
> inside a chroot, so it would be nice not to have to pass arguments to
> `nsenter(1)` that are specific to chroots, like `chroot <real/path/to/chroot>`.
> It's also a bit counterintuitive to have to re-enter the chroot again.
> 
> Also, these extra steps are not needed with `unshare(1)`, which works fine by
> itself. It's solely re-entering the namespace that seems to be problematic.
> 
> I'm using util-linux 2.30.2-0.1 on Debian. I don't believe it's a problem
> specific to Debian, because everything works when using `unshare(1)` by itself,
> as stated.
> 
> (I haven't tried running this inside a chroot-inside-a-chroot.)
> 
> Details:
> 
> # Below is all run inside a "schroot" session, which is a Debian tool for making chroot use more convenient.
> # I used the instructions here (https://wiki.debian.org/sbuild#Create_the_chroot) to create one.
> 
> ## Preparation for the tests
> 
> # Enter the chroot
> $ sudo schroot -c unstable-amd64-sbuild
> # Set up a private-bind file to hold a handle to our new namespace, as documented in the man page of unshare(1)
> (unstable-amd64-sbuild)root@localhost:/tmp# touch ns-mnt; mount --bind --make-private ns-mnt ns-mnt
> # Set up our test script
> (unstable-amd64-sbuild)root@localhost:/tmp# script='mount; ls /; ls -l /proc/$$/ns/mnt; mount -B /dev/null /etc/hosts; echo hosts:; cat /etc/hosts'
> 
> ## Case 1: unshare(1) with no special options or commands, everything works as expected
> 
> (unstable-amd64-sbuild)root@localhost:/tmp# unshare --mount=ns-mnt sh -ec "$script"
> unstable-amd64-sbuild on / type overlay (rw,relatime,lowerdir=/var/lib/schroot/union/underlay/<<SESSIONID>>,...)
> proc on /proc type proc (rw,relatime)
> sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
> [.. etc. other mappings in my chroot ..]
> unstable-amd64-sbuild on /tmp/ns-mnt type overlay (rw,relatime,lowerdir=/var/lib/schroot/union/underlay/<<SESSIONID>>,...)
> bin  boot  build  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run	sbin  srv  sys	tmp  usr  var
> lrwxrwxrwx 1 root root 0 Oct 27 17:35 /proc/31691/ns/mnt -> 'mnt:[4026532398]'
> hosts:
> [.. empty hosts (inside the namespace) ..]
> # we are now back outside the namespace
> # if we cat /etc/hosts (both inside and outside the chroot), we see the original
> 
> ## now we try to re-enter the namespace.
> 
> ## Case 2: nsenter(1) with no extra options or commands, doesn't work:
> 
> (unstable-amd64-sbuild)root@localhost:/tmp# nsenter --mount=ns-mnt sh -ec "$script"
> [.. mappings for my host system, outside the chroot ..]
> bin  boot  dev	etc  home  initrd.img  initrd.img.old  lib  lib32  lib64  libx32  lost+found  media  mnt  opt  proc  root  run	sbin  selinux  srv  sys  tmp  usr  var	vmlinuz  vmlinuz.old
> [.. aka the / on my host filesystem outside the chroot ..]
> lrwxrwxrwx 1 root root 0 Oct 27 19:36 /proc/32434/ns/mnt -> 'mnt:[4026532398]'
> [.. correct namespace ..]
> hosts:
> [.. empty hosts (inside the namespace) ..]
> # if we cat /etc/hosts outside the namespace, it's non-empty inside the chroot but EMPTY outside the chroot.
> # whoops, because we ran mount -B on the original non-chrooted / filesystem. findmnt says:
> └─/etc/hosts                          udev[/null]                        devtmpfs    rw,nosuid,relatime,size=8181852k,nr_inodes=2045463,mode=755
> # we unmount it before proceeding
> 
> ## Case 3: nsenter(1) with --root, partially works but not really:
> 
> (unstable-amd64-sbuild)root@localhost:/tmp# nsenter --root=/ --mount=ns-mnt sh -ec "$script"
> [.. i.e. mount(1) gives empty output ..]
> bin  boot  build  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run	sbin  srv  sys	tmp  usr  var
> [.. at least the root is inside the chroot ..]
> lrwxrwxrwx 1 root root 0 Oct 27 17:37 /proc/878/ns/mnt -> 'mnt:[4026532398]'
> [.. correct namespace ..]
> mount: /etc/hosts: wrong fs type, bad option, bad superblock on /dev/null, missing codepage or helper program, or other error.
> [.. mount operations fail, but the namespace is correct ..]
> [.. if you analyse this case a bit more, you find that /proc/$$/{mounts,mountinfo,mountstats} are all empty ..]
> # exit code 32
> # outside the namespace, /etc/hosts is still non-empty, both inside and outside the chroot
> 
> ## Case 4: nsenter(1) with explicit chroot(1) call, everything works as expected, again:
> 
> (unstable-amd64-sbuild)root@localhost:/tmp# nsenter --mount=ns-mnt chroot /run/schroot/mount/<<SESSIONID>> sh -ec 'mount && ls /'
> unstable-amd64-sbuild on / type overlay (rw,relatime,lowerdir=/var/lib/schroot/union/underlay/<<SESSIONID>>,...)
> proc on /proc type proc (rw,relatime)
> sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
> [.. etc. other mappings in my chroot ..]
> unstable-amd64-sbuild on /tmp/ns-mnt type overlay (rw,relatime,lowerdir=/var/lib/schroot/union/underlay/<<SESSIONID>>,...)
> [.. great, we got our mounts back! ..]
> bin  boot  build  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run	sbin  srv  sys	tmp  usr  var
> lrwxrwxrwx 1 root root 0 Oct 27 17:39 /proc/2025/ns/mnt -> 'mnt:[4026532398]'
> [.. correct namespace ..]
> hosts:
> [.. empty hosts, as desired ..]
> # outside the namespace, /etc/hosts is still non-empty, both inside and outside the chroot
> 
> -- 
> GPG: ed25519/56034877E1F87C35
> GPG: rsa4096/1318EFAC5FBBDBCE
> https://github.com/infinity0/pubkeys.git
> --
> To unsubscribe from this list: send the line "unsubscribe util-linux" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
 Karel Zak  <kzak@xxxxxxxxxx>
 http://karelzak.blogspot.com


