On Thu, Jul 7, 2016 at 8:26 PM, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, 2016-07-07 at 20:00 -0700, Andrew Vagin wrote:
>> On Thu, Jul 07, 2016 at 07:16:18PM -0700, Andrew Vagin wrote:
>> > On Thu, Jul 07, 2016 at 12:17:35PM -0700, James Bottomley wrote:
>> > > On Thu, 2016-07-07 at 20:21 +0200, Michael Kerrisk (man-pages) wrote:
>> > > > On 7 July 2016 at 17:01, James Bottomley
>> > > > <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
>> > > [Serge already answered the parenting issue]
>> > > > > On Thu, 2016-07-07 at 08:36 -0500, Serge E. Hallyn wrote:
>> > > > > > Hm. Probably best-effort based on the process hierarchy. So
>> > > > > > yeah you could probably get a tree into a state that would be
>> > > > > > wrongly recreated. Create a new netns, bind mount it, exit; Have
>> > > > > > another task create a new user_ns, bind mount it, exit; Third
>> > > > > > task setns()s first to the new netns then to the new user_ns. I
>> > > > > > suspect criu will recreate that wrongly.
>> > > > >
>> > > > > This is a bit pathological, and you have to be root to do it: so
>> > > > > root can set up a nesting hierarchy, bind it and destroy the pids
>> > > > > but I know of no current orchestration system which does this.
>> > > > >
>> > > > > Actually, I have to back pedal a bit: the way I currently set up
>> > > > > architecture emulation containers does precisely this: I set up the
>> > > > > namespaces unprivileged with child mount namespaces, but then I ask
>> > > > > root to bind the userns and kill the process that created it so I
>> > > > > have a permanent handle to enter the namespace by, so I suspect
>> > > > > that when our current orchestration systems get more sophisticated,
>> > > > > they might eventually want to do something like this as well.
>> > > > >
>> > > > > In theory, we could get nsfs to show this information as an option
>> > > > > (just add a show_options entry to the superblock ops), but the
>> > > > > problem is that although each namespace has a parent user_ns,
>> > > > > there's no way to get it without digging in the namespace specific
>> > > > > structure. Probably we should restructure to move it into
>> > > > > ns_common, then we could display it (and enforce all namespaces
>> > > > > having owning user_ns) but it would be a
>> > > >
>> > > > I'm missing something here. Is it not already the case that all
>> > > > namespaces have an owning user_ns?
>> > >
>> > > Um, yes, I don't believe I said they don't. The problem I thought you
>> > > were having is that there's no way of seeing what it is.
>> > >
>> > > nsfs is the Namespace filesystem where bound namespaces appear to a cat
>> > > of /proc/self/mounts. It can display any information that's in
>> > > ns_common (the common core of namespaces) but the owning user_ns
>> > > pointer currently isn't in this structure. Every user namespace has a
>> > > pointer to it, but they're all privately embedded in the individual
>> > > namespace specific structures. What I was proposing was that since
>> > > every current namespace has a pointer somewhere to the owning user
>> > > namespace, we could abstract this out into ns_common so it's now
>> > > accessible to be displayed by nsfs, probably as a mount option.
>> >
>> > James, I am not sure that I understood you correctly. We have one
>> > file system for all namespace files, how we can show per-file properties
>> > in mount options. I think we can show all required information in
>> > fdinfo. We open a namespaces file (/proc/pid/ns/N) and then read
>> > /proc/pid/fdinfo/X for it.
>>
>> Here is a proof-of-concept patch.
>>
>> How it works:
>>
>> In [1]: import os
>>
>> In [2]: fd = os.open("/proc/self/ns/pid", os.O_RDONLY)
>>
>> In [3]: print open("/proc/self/fdinfo/%d" % fd).read()
>> pos: 0
>> flags: 0100000
>> mnt_id: 2
>> userns: 4026531837
>>
>> In [4]: print "/proc/self/ns/user -> %s" % os.readlink("/proc/self/ns/user")
>> /proc/self/ns/user -> user:[4026531837]
>
> can't you just do
>
> readlink /proc/self/ns/user | sed 's/.*\[\(.*\)\]/\1/'

We can get fdinfo for any ns file. I used /proc/self/ns/pid as an example.

Look at another example:

[root@fc22-vm ~]# cat /proc/self/mountinfo | grep pid_ns_file
115 38 0:3 pid:[4026532306] /tmp/pid_ns_file rw shared:67 - nsfs nsfs rw

In [4]: print open("/proc/self/fdinfo/5").read()
pos: 0
flags: 0100000
mnt_id: 115
userns: 4026532305

In [5]: os.readlink("/proc/self/ns/user")
Out[5]: 'user:[4026531837]'

>
> ?
>
> But what Michael was asking about was the parent user_ns of all the
> other namespaces ... I don't think there's any way we can get that out
> of any information in /proc/self/
>
> James
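
For reference, the fdinfo lookup from the examples above could be wrapped in a small Python 3 helper along the following lines. This is only a sketch under assumptions: the "userns:" line is taken from the proof-of-concept patch discussed in this thread and is not expected from a kernel without it, so the helper returns None when the field is missing. The function names (parse_ns_link, parse_fdinfo, owning_userns) are made up for illustration.

```python
import os
import re

# Matches namespace link targets such as 'user:[4026531837]'.
NS_LINK_RE = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a link target like 'user:[4026531837]' into (type, inode)."""
    match = NS_LINK_RE.match(target)
    if match is None:
        raise ValueError("not a namespace link: %r" % (target,))
    return match.group(1), int(match.group(2))


def parse_fdinfo(text):
    """Turn the 'key: value' lines of an fdinfo file into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def owning_userns(ns_path):
    """Return the owning user_ns inode reported in fdinfo for ns_path.

    Assumption: the 'userns' field only exists on a kernel carrying the
    proof-of-concept patch; without it this function returns None.
    """
    fd = os.open(ns_path, os.O_RDONLY)
    try:
        with open("/proc/self/fdinfo/%d" % fd) as fdinfo:
            fields = parse_fdinfo(fdinfo.read())
    finally:
        os.close(fd)
    value = fields.get("userns")
    return int(value) if value is not None else None
```

Calling owning_userns("/tmp/pid_ns_file") on the bind-mounted file from the second example would then correspond to reading the userns line by hand, and parse_ns_link(os.readlink("/proc/self/ns/user")) gives the caller's own user namespace inode for comparison.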