On Sun, Dec 11, 2016 at 12:54:56PM +0100, Michael Kerrisk (man-pages) wrote:
> [was: [PATCH 0/4 v3] Add an interface to discover relationships
> between namespaces]
>
> Hello Andrei
>
> See below for my attempt to document the following.

Hi Michael,

Eric already did my work :). I have read this documentation and it looks
good to me. I have nothing to add to Eric's comments.

Thanks,
Andrei

>
> On 6 September 2016 at 09:47, Andrei Vagin <avagin@xxxxxxxxxx> wrote:
> > From: Andrey Vagin <avagin@xxxxxxxxxx>
> >
> > Each namespace has an owning user namespace, and currently there is
> > no way to discover these relationships.
> >
> > Pid and user namespaces are hierarchical. There is no way to discover
> > parent-child relationships either.
> >
> > Why might we want to know the relationships between namespaces?
> >
> > One use would be visualization, in order to understand the running
> > system. Another would be to answer the question: what capability does
> > process X have to perform operations on a resource governed by
> > namespace Y?
> >
> > One more use case (which is usually called abnormal) is
> > checkpoint/restart. In CRIU we are going to dump and restore nested
> > namespaces.
> >
> > There [1] was a discussion about which interface to choose to
> > determine relationships between namespaces.
> >
> > Eric suggested adding two ioctl-s [2]:
> >> Grumble, Grumble. I think this may actually be a case for creating
> >> ioctls for these two cases. Now that random nsfs file descriptors
> >> are bind mountable the original reason for using proc files is not
> >> as pressing.
> >>
> >> One ioctl for the user namespace that owns a file descriptor.
> >> One ioctl for the parent namespace of a namespace file descriptor.
> >
> > Here is an implementation of these ioctl-s.
> >
> > $ man man7/namespaces.7
> > ...
> > Since Linux 4.X, the following ioctl(2) calls are supported for
> > namespace file descriptors. The correct syntax is:
> >
> >       fd = ioctl(ns_fd, ioctl_type);
> >
> > where ioctl_type is one of the following:
> >
> > NS_GET_USERNS
> >       Returns a file descriptor that refers to an owning user
> >       namespace.
> >
> > NS_GET_PARENT
> >       Returns a file descriptor that refers to a parent namespace.
> >       This ioctl(2) can be used for pid and user namespaces. For
> >       user namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
> >       meaning.
> >
> > In addition to generic ioctl(2) errors, the following specific ones
> > can occur:
> >
> > EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.
> >
> > EPERM  The requested namespace is outside of the current namespace
> >        scope.
> >
> > [1] https://lkml.org/lkml/2016/7/6/158
> > [2] https://lkml.org/lkml/2016/7/9/101
>
> The following is the text I propose to add to the namespaces(7) page.
> Could you please review and let me know of corrections and
> improvements.
>
> Thanks,
>
> Michael
>
>
>    Introspecting namespace relationships
>        Since Linux 4.9, two ioctl(2) operations are provided to allow
>        introspection of namespace relationships (see user_namespaces(7)
>        and pid_namespaces(7)). The form of the calls is:
>
>            ioctl(fd, request);
>
>        In each case, fd refers to a /proc/[pid]/ns/* file.
>
>        NS_GET_USERNS
>               Returns a file descriptor that refers to the owning user
>               namespace for the namespace referred to by fd.
>
>        NS_GET_PARENT
>               Returns a file descriptor that refers to the parent
>               namespace of the namespace referred to by fd. This
>               operation is valid only for hierarchical namespaces
>               (i.e., PID and user namespaces). For user namespaces,
>               NS_GET_PARENT is synonymous with NS_GET_USERNS.
>
>        In each case, the returned file descriptor is opened with
>        O_RDONLY and O_CLOEXEC (close-on-exec).
>
>        By applying fstat(2) to the returned file descriptor, one
>        obtains a stat structure whose st_ino (inode number) field
>        identifies the owning/parent namespace. This inode number can
>        be matched with the inode number of another
>        /proc/[pid]/ns/{pid,user} file to determine whether that is
>        the owning/parent namespace.
>
>        Either of these ioctl(2) operations can fail with the following
>        error:
>
>        EPERM  The requested namespace is outside of the caller's
>               namespace scope. This error can occur if, for example,
>               the owning user namespace is an ancestor of the caller's
>               current user namespace. It can also occur on attempts to
>               obtain the parent of the initial user or PID namespace.
>
>        Additionally, the NS_GET_PARENT operation can fail with the
>        following error:
>
>        EINVAL fd refers to a nonhierarchical namespace.
>
>        See the EXAMPLE section for an example of the use of these
>        operations.
>
> [...]
>
> EXAMPLE
>        The example shown below uses the ioctl(2) operations described
>        above to perform simple introspection of namespace
>        relationships. The following shell sessions show various
>        examples of the use of this program.
>
>        Trying to get the parent of the initial user namespace fails,
>        for the reasons explained earlier:
>
>            $ ./ns_introspect /proc/self/ns/user p
>            The parent namespace is outside your namespace scope
>
>        Create a process running sleep(1) that resides in new user and
>        UTS namespaces, and show that the new UTS namespace is
>        associated with the new user namespace:
>
>            $ unshare -Uu sleep 1000 &
>            [1] 23235
>            $ ./ns_introspect /proc/23235/ns/uts
>            Inode number of owning user namespace is: 4026532448
>            $ readlink /proc/23235/ns/user
>            user:[4026532448]
>
>        Then show that the parent of the new user namespace in the
>        preceding example is the initial user namespace:
>
>            $ readlink /proc/self/ns/user
>            user:[4026531837]
>            $ ./ns_introspect /proc/23235/ns/user
>            Inode number of owning user namespace is: 4026531837
>
>        Start a shell in a new user namespace, and show that from
>        within this shell, the parent user namespace can't be
>        discovered. Similarly, the UTS namespace (which is associated
>        with the initial user namespace) can't be discovered.
>
>            $ PS1="sh2$ " unshare -U bash
>            sh2$ ./ns_introspect /proc/self/ns/user p
>            The parent namespace is outside your namespace scope
>            sh2$ ./ns_introspect /proc/self/ns/uts u
>            The owning user namespace is outside your namespace scope
>
>    Program source
>
>        /* ns_introspect.c
>
>           Licensed under GNU General Public License v2 or later
>        */
>        #include <stdlib.h>
>        #include <unistd.h>
>        #include <stdio.h>
>        #include <sys/stat.h>
>        #include <fcntl.h>
>        #include <sys/ioctl.h>
>        #include <string.h>
>        #include <errno.h>
>
>        #ifndef NS_GET_USERNS
>        #define NSIO    0xb7
>        #define NS_GET_USERNS   _IO(NSIO, 0x1)
>        #define NS_GET_PARENT   _IO(NSIO, 0x2)
>        #endif
>
>        int
>        main(int argc, char *argv[])
>        {
>            int fd, userns_fd, parent_fd;
>            struct stat sb;
>
>            if (argc < 2) {
>                fprintf(stderr, "Usage: %s /proc/[pid]/ns/[file] [p|u]\n",
>                        argv[0]);
>                fprintf(stderr, "\nDisplay the result of one or both "
>                        "of NS_GET_USERNS (u) or NS_GET_PARENT (p)\n"
>                        "for the specified /proc/[pid]/ns/[file]. If neither "
>                        "'p' nor 'u' is specified,\n"
>                        "NS_GET_USERNS is the default.\n");
>                exit(EXIT_FAILURE);
>            }
>
>            /* Obtain a file descriptor for the 'ns' file specified
>               in argv[1] */
>
>            fd = open(argv[1], O_RDONLY);
>            if (fd == -1) {
>                perror("open");
>                exit(EXIT_FAILURE);
>            }
>
>            /* Obtain a file descriptor for the owning user namespace and
>               then obtain and display the inode number of that namespace */
>
>            if (argc < 3 || strchr(argv[2], 'u')) {
>                userns_fd = ioctl(fd, NS_GET_USERNS);
>
>                if (userns_fd == -1) {
>                    if (errno == EPERM)
>                        printf("The owning user namespace is outside "
>                                "your namespace scope\n");
>                    else
>                       perror("ioctl-NS_GET_USERNS");
>                    exit(EXIT_FAILURE);
>                }
>
>                if (fstat(userns_fd, &sb) == -1) {
>                    perror("fstat-userns");
>                    exit(EXIT_FAILURE);
>                }
>                printf("Inode number of owning user namespace is: %ld\n",
>                        (long) sb.st_ino);
>
>                close(userns_fd);
>            }
>
>            /* Obtain a file descriptor for the parent namespace and
>               then obtain and display the inode number of that namespace */
>
>            if (argc > 2 && strchr(argv[2], 'p')) {
>                parent_fd = ioctl(fd, NS_GET_PARENT);
>
>                if (parent_fd == -1) {
>                    if (errno == EINVAL)
>                        printf("Can't get parent namespace of a "
>                                "nonhierarchical namespace\n");
>                    else if (errno == EPERM)
>                        printf("The parent namespace is outside "
>                                "your namespace scope\n");
>                    else
>                        perror("ioctl-NS_GET_PARENT");
>                    exit(EXIT_FAILURE);
>                }
>
>                if (fstat(parent_fd, &sb) == -1) {
>                    perror("fstat-parentns");
>                    exit(EXIT_FAILURE);
>                }
>                printf("Inode number of parent namespace is: %ld\n",
>                        (long) sb.st_ino);
>
>                close(parent_fd);
>            }
>
>            exit(EXIT_SUCCESS);
>        }
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
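
The proposed text says that the inode number obtained via fstat(2) can be matched against another /proc/[pid]/ns/{pid,user} file, but the example program only prints the number. The following is a minimal sketch of that matching step, not part of the proposed page. The helper names same_ns_identity() and is_owning_userns() are hypothetical, and comparing st_dev in addition to st_ino is a conservative assumption on my part; the text above mentions only st_ino.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef NS_GET_USERNS
#define NSIO            0xb7
#define NS_GET_USERNS   _IO(NSIO, 0x1)
#endif

/* Hypothetical helper: return 1 if two stat results describe the same
   object. Checking st_dev as well as st_ino is an assumption made here
   for safety; the proposed text matches on st_ino only. */
int
same_ns_identity(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical helper: return 1 if 'candidate_path' refers to the user
   namespace that owns the namespace at 'ns_path', 0 if it does not,
   and -1 on error (with errno set by the failing call). */
int
is_owning_userns(const char *ns_path, const char *candidate_path)
{
    struct stat owner_sb, cand_sb;
    int fd, userns_fd, saved_errno, res;

    /* stat(2) follows the /proc link, so cand_sb describes the
       candidate namespace itself */
    if (stat(candidate_path, &cand_sb) == -1)
        return -1;

    fd = open(ns_path, O_RDONLY);
    if (fd == -1)
        return -1;

    /* Ask the kernel for the owning user namespace of 'ns_path' */
    userns_fd = ioctl(fd, NS_GET_USERNS);
    saved_errno = errno;
    close(fd);
    if (userns_fd == -1) {
        errno = saved_errno;        /* e.g., EPERM: outside our scope */
        return -1;
    }

    if (fstat(userns_fd, &owner_sb) == -1)
        res = -1;
    else
        res = same_ns_identity(&owner_sb, &cand_sb);

    saved_errno = errno;
    close(userns_fd);
    errno = saved_errno;
    return res;
}
```

With the sleep(1) process from the shell session above, a call such as is_owning_userns("/proc/23235/ns/uts", "/proc/23235/ns/user") would be expected to return 1, since both refer to inode 4026532448. A return of -1 with errno set to EPERM corresponds to the "outside your namespace scope" case.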