[patch] Add docs on mount namespace rootfs access and pid namespace pid mapping

The attached 4-patch series adds information to the mount namespaces
and pid namespaces documentation to help users discover how to access
important related information.

1. Elaborate on /proc/[pid]/root and x-ref it
2. Mention /proc/$pid/status NSpid in pid_namespaces
3. Mention pid namespaces /proc/[pid]/root/proc
4. Additional namespaces related x-refs

1): Mention /proc/[pid]/root in mount_namespaces(7) to help people
discover how to access the file system tree seen by a process in
another mount namespace. In the proc (5) entry for it, warn about the
possibly-confusing semantics of readlink() vs following the path in
the vfs layer.

  Adding because I found it difficult to figure out how to access the
file system seen by another process in a disjoint chroot in a
non-ancestor mount namespace.
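
  For illustration only (this is not part of the patches), the difference
between reading the link and following it can be sketched in Python. The
helper name compare_root_views is hypothetical, and the sketch assumes a
Linux system with procfs mounted at /proc.

```python
import os


def compare_root_views(pid):
    """Return (link_text, entries) for /proc/<pid>/root.

    link_text is what readlink(2) reports; entries is what a directory
    listing through the link (kernel path lookup) returns.
    """
    root_link = f"/proc/{pid}/root"
    # readlink(2) reports the root as the target process names it,
    # which is typically "/" and says nothing about the caller's tree.
    link_text = os.readlink(root_link)
    # Walking through the link lets the kernel resolve the path in the
    # target's root, so this lists the target's "/" and not the caller's.
    entries = sorted(os.listdir(root_link + "/"))
    return link_text, entries


if __name__ == "__main__":
    # Using our own pid keeps the sketch runnable without privileges.
    # Substitute the pid of a process in another mount namespace to see
    # the two views diverge (that needs ptrace-level access to the pid).
    link_text, entries = compare_root_views(os.getpid())
    print("readlink:", link_text)
    print("listing via link:", entries)
```

  For the calling process itself both views agree; they should only differ
when the target pid has a different root or mount namespace.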

2): Mention the /proc/[pid]/status NSpid field and related fields in
pid_namespaces (7) to help people discover how to map process IDs
between a parent namespace and any child namespace(s) the process is
in.

  Adding because I found it difficult to discover how to map pids
between namespaces.
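
  As a rough sketch of how the mapping could be read (again not part of the
patches), the following Python parses the NS* fields from a status file.
The function names parse_ns_ids and read_ns_ids are hypothetical; the
fields themselves are only present on Linux 4.1 and later.

```python
import os


def parse_ns_ids(status_text):
    """Extract the NS* ID lists from the text of a /proc/<pid>/status file."""
    wanted = ("NStgid", "NSpid", "NSpgid", "NSsid")
    ids = {}
    for line in status_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name in wanted:
            # Leftmost value: the ID in the PID namespace of the procfs
            # mount. Values to the right: IDs in successively nested
            # inner namespaces.
            ids[name] = [int(v) for v in value.split()]
    return ids


def read_ns_ids(pid, proc="/proc"):
    """Read and parse the NS* fields for pid from the given procfs mount."""
    with open(f"{proc}/{pid}/status") as f:
        return parse_ns_ids(f.read())


if __name__ == "__main__":
    # A process that is not in a nested PID namespace has one value
    # per field.
    print(read_ns_ids(os.getpid()))
```

  The last element of the NSpid list is the pid as seen from the innermost
pid namespace the process is a member of.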

3): Mention how /proc/[pid]/root/proc behaves when [pid] is in a
different pid namespace. It's useful to know that you can see another
process's view of procfs via its /proc/[pid]/root link.
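
  A minimal Python sketch of that idea, assuming the target has procfs
mounted at /proc inside its own root (the helper names procfs_view_of and
visible_pids are hypothetical, and this is not part of the patches):

```python
import os


def procfs_view_of(pid, proc="/proc"):
    """Path to the procfs tree as mounted inside the root of pid."""
    return f"{proc}/{pid}/root/proc"


def visible_pids(procfs_dir):
    """Return the numeric entries of a procfs directory as sorted ints."""
    return sorted(
        int(name) for name in os.listdir(procfs_dir) if name.isdecimal()
    )


if __name__ == "__main__":
    # With our own pid this lists the same pids as /proc itself. With
    # the pid of a process in a child pid namespace it should list only
    # the pids that exist in that namespace.
    print(visible_pids(procfs_view_of(os.getpid())))
```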

4): Some minor cross-references and see-alsos that would've helped me
during unrelated past efforts.
From f99c68f1535dea4f1d926d5a91b1b772557743de Mon Sep 17 00:00:00 2001
From: Craig Ringer <craig.ringer@xxxxxxxxxxxxxxx>
Date: Mon, 14 Mar 2022 13:35:38 +0800
Subject: [PATCH v1 1/4] Elaborate on /proc/[pid]/root and x-ref it

Mention /proc/[pid]/{root,cwd,exe,fd} in mount_namespaces (7)
to help users understand how to access the file system tree of
a process in a different mount namespace and possibly-disjoint
chroot.

In proc (5) provide a little more detail on how links like
/proc/[pid]/root behave when read with readlink (2) vs when
resolved via kernel vfs layer path lookup. It can be quite confusing
that "readlink /proc/$pid/root" prints "/" so
"ls $(readlink /proc/$pid/root)" has the same result as "ls /" but
"ls /proc/$pid/root/" actually lists the target pid's root.

Signed-off-by: Craig Ringer <craig.ringer@xxxxxxxxxxxxxxx>
---
 man5/proc.5             | 29 ++++++++++++++++++++++++++++-
 man7/mount_namespaces.7 | 14 ++++++++++++++
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/man5/proc.5 b/man5/proc.5
index c6684620e..2eed160e2 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -658,6 +658,12 @@ are not available if the main thread has already terminated
 (typically by calling
 .BR pthread_exit (3)).
 .IP
+If the process is in a chroot and/or a different mount namespace, reading the
+symlink path will return the executable path relative to the process's root.
+Opening the path within the kernel vfs layer will yield the actual executable
+contents even if the path does not exist within the currently active mount
+namespace.
+.IP
 Permission to dereference or read
 .RB ( readlink (2))
 this symbolic link is governed by a ptrace access mode
@@ -1830,7 +1836,8 @@ and
 .IP
 Note however that this file is not merely a symbolic link.
 It provides the same view of the filesystem (including namespaces and the
-set of per-process mounts) as the process itself.
+set of per-process mounts) as the process itself
+if dereferenced via the kernel vfs layer.
 An example illustrates this point.
 In one terminal, we start a shell in new user and mount namespaces,
 and in that shell we create some new mounts:
@@ -1866,6 +1873,26 @@ sh2# \fBls /usr | wc \-l\fP                  # /usr in initial NS
 .EE
 .in
 .IP
+If the target process is in a different mount namespace
+and has a different root, following the
+.B /proc/[pid]/root
+link directly will resolve paths relative to the target
+process's root. But
+.BR readlink (2)
+will return the root path as seen from within the target process's mount
+namespace. Tools that canonicalize paths or resolve symbolic links in
+user-space will not be able to see the target process's root. So
+.B ls $(realpath /proc/[pid]/root)
+will expand to
+.B ls /
+and print the root of the invoking shell, but
+.B ls /proc/[pid]/root/
+will list the contents of
+.B /
+as seen by [pid]. See
+.BR mount_namespaces (7)
+for details.
+.IP
 .\" The following was still true as at kernel 2.6.13
 In a multithreaded process, the contents of the
 .I /proc/[pid]/root
diff --git a/man7/mount_namespaces.7 b/man7/mount_namespaces.7
index 7725b341f..98bfd864c 100644
--- a/man7/mount_namespaces.7
+++ b/man7/mount_namespaces.7
@@ -75,6 +75,20 @@ and
 in either mount namespace will not (by default) affect the
 mount list seen in the other namespace
 (but see the following discussion of shared subtrees).
+.PP
+The pseudo-symlinks
+.IR /proc/[pid]/exe ,
+.IR /proc/[pid]/root ,
+.IR /proc/[pid]/fd ,
+and
+.IR /proc/[pid]/cwd
+provide views into the mount namespace of
+.IR [pid]
+from outside that namespace.
+These links provide a way to access the mount namespace seen by another process,
+even if its root is disjoint from the current process's root. See
+.BR proc (5)
+for details and caveats.
 .\"
 .SH SHARED SUBTREES
 After the implementation of mount namespaces was completed,
-- 
2.34.1

From 4dd9d464c6e34c75e9456745b6cf71fd0360db44 Mon Sep 17 00:00:00 2001
From: Craig Ringer <craig.ringer@xxxxxxxxxxxxxxx>
Date: Mon, 14 Mar 2022 13:49:19 +0800
Subject: [PATCH v1 2/4] Mention /proc/$pid/status NSpid in pid_namespaces

The pid_namespaces (7) documentation did not explain how to map
a process ID from a parent namespace to the corresponding pid in
a child namespace.

Mention the /proc/$pid/status NSpid field and related fields
in pid_namespaces (7) and add a cross-reference to the details
in proc (5).

Signed-off-by: Craig Ringer <craig.ringer@xxxxxxxxxxxxxxx>
---
 man5/proc.5           |  4 +++-
 man7/pid_namespaces.7 | 20 ++++++++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/man5/proc.5 b/man5/proc.5
index 2eed160e2..5180b4b30 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -2602,7 +2602,9 @@ followed by the value in successively nested inner namespaces.
 .IR NSpid
 Thread ID in each of the PID namespaces of which
 .I [pid]
-is a member.
+is a member. This field provides a mapping between parent and child pids
+for processes in nested
+.BR pid_namespaces (7).
 The fields are ordered as for
 .IR NStgid .
 (Since Linux 4.1.)
diff --git a/man7/pid_namespaces.7 b/man7/pid_namespaces.7
index f99b9abbc..f74b7fccd 100644
--- a/man7/pid_namespaces.7
+++ b/man7/pid_namespaces.7
@@ -355,6 +355,14 @@ yields the process ID of the caller in the PID namespace of the procfs mount
 (i.e., the PID namespace of the process that mounted the procfs).
 This can be useful for introspection purposes,
 when a process wants to discover its PID in other namespaces.
+.PP
+Every process has a mapping of parent-to-child process IDs in
+the
+.B NSpid
+field of its
+.B /proc/$childpid/status
+file. Only pids visible in the pid namespace the procfs is mounted with and any
+child namespaces will be shown.
 .\"
 .\" ============================================================
 .\"
@@ -379,6 +387,18 @@ capability inside the user namespace that owns the PID namespace.
 This makes it possible to determine the PID that is allocated
 to the next process that is created inside this PID namespace.
 .\"
+.TP
+.BR /proc/$pid/status
+The
+.BR NStgid ", " NSpid ", " NSpgid ", and " NSsid
+fields of this file map the process ID and other pid-namespaced attributes of
+.BR $pid
+between the current pid namespace and any child namespaces
+of which the process is a member.
+See
+.BR proc (5)
+for details.
+.\"
 .\" ============================================================
 .\"
 .SS Miscellaneous
-- 
2.34.1

From a2cba8f544167fdaefaa9936c21226a3704d6b73 Mon Sep 17 00:00:00 2001
From: Craig Ringer <craig.ringer@xxxxxxxxxxxxxxx>
Date: Mon, 14 Mar 2022 13:35:38 +0800
Subject: [PATCH v1 3/4] Mention pid namespaces /proc/[pid]/root/proc

Add a note in pid_namespaces (7) to explain that
/proc/$pid/root/proc can be used to see the procfs
seen by $pid within its pid namespace.

Signed-off-by: Craig Ringer <craig.ringer@xxxxxxxxxxxxxxx>
---
 man5/proc.5           |  8 ++++++++
 man7/pid_namespaces.7 | 10 ++++++++++
 2 files changed, 18 insertions(+)

diff --git a/man5/proc.5 b/man5/proc.5
index 5180b4b30..87445fd55 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -1893,6 +1893,14 @@ as seen by [pid]. See
 .BR mount_namespaces (7)
 for details.
 .IP
+If the target process is in a different pid namespace and has
+.B /proc
+mounted within its mount namespace,
+.B /proc/[pid]/root/proc
+will contain the procfs tree as seen by [pid]. See
+.BR pid_namespaces (7)
+for details.
+.IP
 .\" The following was still true as at kernel 2.6.13
 In a multithreaded process, the contents of the
 .I /proc/[pid]/root
diff --git a/man7/pid_namespaces.7 b/man7/pid_namespaces.7
index f74b7fccd..52a40f544 100644
--- a/man7/pid_namespaces.7
+++ b/man7/pid_namespaces.7
@@ -356,6 +356,16 @@ yields the process ID of the caller in the PID namespace of the procfs mount
 This can be useful for introspection purposes,
 when a process wants to discover its PID in other namespaces.
 .PP
+Processes in parent mount namespaces can see a child process's view
+of its namespace in
+.B /proc/$childpid/root/proc
+if the child process has
+.B /proc
+mounted within the child namespace.
+See
+.BR proc (5)
+for details.
+.PP
 Every process has a mapping of parent-to-child process IDs in
 the
 .B NSpid
-- 
2.34.1

From 336113bc0d2ea66d44a9f1fb7dee06b04e1cb8da Mon Sep 17 00:00:00 2001
From: Craig Ringer <craig.ringer@xxxxxxxxxxxxxxx>
Date: Mon, 14 Mar 2022 13:52:28 +0800
Subject: [PATCH v1 4/4] Additional namespaces related x-refs

Signed-off-by: Craig Ringer <craig.ringer@xxxxxxxxxxxxxxx>
---
 man5/proc.5             | 9 +++++++++
 man7/mount_namespaces.7 | 1 +
 man7/pid_namespaces.7   | 2 ++
 3 files changed, 12 insertions(+)

diff --git a/man5/proc.5 b/man5/proc.5
index 87445fd55..dbc064996 100644
--- a/man5/proc.5
+++ b/man5/proc.5
@@ -1539,6 +1539,10 @@ process's mount namespace (see
 The format of this file is documented in
 .BR fstab (5).
 .IP
+.BR /proc/[pid]/mountinfo
+provides more detail than
+.BR /proc/[pid]/mounts "."
+.IP
 Since kernel version 2.6.15, this file is pollable:
 after opening the file for reading, a change in this file
 (i.e., a filesystem mount or unmount) causes
@@ -2099,6 +2103,11 @@ This is used by
 It is defined in the kernel source file
 .IR fs/proc/array.c "."
 .IP
+The newer
+.B /proc/[pid]/status
+provides more details and is easier to use than
+.BR /proc/[pid]/stat "."
+.IP
 The fields, in order, with their proper
 .BR scanf (3)
 format specifiers, are listed below.
diff --git a/man7/mount_namespaces.7 b/man7/mount_namespaces.7
index 98bfd864c..d206d4bc1 100644
--- a/man7/mount_namespaces.7
+++ b/man7/mount_namespaces.7
@@ -1344,6 +1344,7 @@ See
 .BR pivot_root (2).
 .SH SEE ALSO
 .BR unshare (1),
+.BR nsenter (1),
 .BR clone (2),
 .BR mount (2),
 .BR mount_setattr (2),
diff --git a/man7/pid_namespaces.7 b/man7/pid_namespaces.7
index 52a40f544..f202feedc 100644
--- a/man7/pid_namespaces.7
+++ b/man7/pid_namespaces.7
@@ -436,3 +436,5 @@ See
 .BR namespaces (7),
 .BR user_namespaces (7),
 .BR switch_root (8)
+.BR nsenter (1),
+.BR unshare (1)
-- 
2.34.1

