On Mon, Apr 01, 2019 at 08:36:26AM -0700, Linus Torvalds wrote:
> On Mon, Apr 1, 2019 at 4:41 AM Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
> >
> > Eric pitched a procfs2 which would *just* be the PIDs some time ago (in
> > an attempt to make it possible one day to mount /proc inside a container
> > without adding a bunch of masked paths), though it was just an idea and
> > I don't know if he ever had a patch for it.
>
> I wonder if we really want a fill procfs2, or maybe we could just make

No, I don't think we want a full procfs2.

> the pidfd readable (yes, it's a directory file descriptor, but we
> could allow reading).

Hm, if I understand this correctly, then the pidfd we return from
pidfd_open() would still be a dirfd but not tied to procfs? So I would
implement a "dummy" procfs anon_procfs that is a kernel internal mount
from which we allocate inodes, stash struct pid and off to userspace we
go?

>
> What are the *actual* use cases for opening /proc files through it? If
> it's really just for a small subset that android wants to do this
> (getting basic process state like "running" etc), rather than anything
> else, then we could skip the whole /proc linking entirely and go the
> other way instead (ie open_pidfd() would get that limited IO model,
> and we could make the /proc directory node get the same limited IO
> model).

From the original thread, where metadata access was apparently very
important, these are the things that were listed:

<quote>
And how do you propose, given one of these handle objects, getting a
process's current priority, or its current oom score, or its list of
memory maps? As I mentioned in my original email, and which nobody has
addressed, if you don't use a dirfd as your process handle or you don't
provide an easy way to get one of these proc directory FDs, you need to
duplicate a lot of metadata access interfaces.
</quote>
( https://lore.kernel.org/lkml/CALCETrUFrFKC2YTLH7ViM_7XPYk3LNmNiaz6s8wtWo1pmJQXzg@xxxxxxxxxxxxxx/ )

Joel can probably speak best to this.
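
To make the dirfd argument in the quoted text concrete, here is a minimal
userspace sketch. It assumes the process handle is an ordinary /proc/<pid>
directory fd opened by path. It does not use the pidfd_open() interface
discussed above, whose semantics are still under debate. The helper names
open_proc_dirfd() and read_proc_file_at() are made up for illustration and
are not existing kernel or libc APIs.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open /proc/<pid> as a directory fd.
 * Returns the fd, or -1 on error (errno set by open()).
 */
static int open_proc_dirfd(pid_t pid)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/%d", (int)pid);
	return open(path, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
}

/*
 * Hypothetical helper: read one metadata file (e.g. "oom_score", "maps")
 * relative to the directory fd. Performs a single read(), so large files
 * such as "maps" may be truncated to len - 1 bytes.
 * Returns the number of bytes read (buf is NUL-terminated), or -1 on error.
 */
static ssize_t read_proc_file_at(int dirfd, const char *name,
				 char *buf, size_t len)
{
	ssize_t n;
	int fd;

	if (len == 0)
		return -1;

	/* The lookup is relative to dirfd, not to a re-resolved /proc/<pid>. */
	fd = openat(dirfd, name, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	n = read(fd, buf, len - 1);
	close(fd);
	if (n < 0)
		return -1;

	buf[n] = '\0';
	return n;
}
```

With a handle of this kind, the oom score, memory maps and similar metadata
all come through the same openat() path, which is the duplication the quoted
mail says you would otherwise have to reimplement.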