On Mon, Apr 1, 2019 at 12:33 AM Christian Brauner <christian@xxxxxxxxxx> wrote:
> On Sun, Mar 31, 2019 at 03:16:47PM -0700, Linus Torvalds wrote:
> > On Sun, Mar 31, 2019 at 3:03 PM Christian Brauner <christian@xxxxxxxxxx> wrote:
> > > Thanks for the input. The problem Jann and I saw with this is that it
> > > would be awkward to have the kernel open a file in some procfs instance,
> > > since then userspace would have to specify which procfs instance the fd
> > > should come from.
> >
> > I would actually suggest we just make the rules be that the
> > pidfd_open() always return the internal /proc entry regardless of any
> > mount-point (or any "hidepid") but also suggest that exactly *because*
> > it gives you visibility into the target pid, you'd basically require
> > the strictest kind of control of the process you're trying to get the
> > pidfd of.
> >
> > Ie likely something along the lines of
> >
> >     ptrace_may_access(task, PTRACE_MODE_ATTACH_REALCREDS)
>
> I can live with that but I would like to hear what Jann thinks too if
> that's ok.

Ah, yes. That seems reasonable. And, as Linus said, pidfd_open() is
less important if you can just do open("/proc/...") on systems with
procfs instead.

One minor detail to keep in mind for the future is that in a
straightforward implementation of this concept, if a non-capable
process is running in a mount namespace, but in the initial network
namespace, without any reachable /proc mount, it will be able to look
at information about other processes' network connections by first
using pidfd_open() on itself or by using clone(CLONE_PIDFD), then
looking at the "net" directory under the resulting file descriptor.
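
To make that last point concrete, here is a rough userspace sketch. It
assumes the design discussed in this thread, where the pidfd is a
directory fd for the task's /proc entry. pidfd_open() does not exist in
this form, so the sketch uses a plain open() of a directory as a
stand-in for obtaining such an fd. read_via_dirfd() is a hypothetical
helper name, not a proposed interface.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Opens `dir` as a directory fd (a stand-in for a pidfd that refers to
 * a /proc/<pid> directory), then reads the entry `rel` relative to that
 * fd. Returns the number of bytes read (buf is NUL-terminated), or -1
 * on error.
 */
ssize_t read_via_dirfd(const char *dir, const char *rel,
                       char *buf, size_t len)
{
    int dirfd, fd;
    ssize_t n;

    if (len == 0)
        return -1;

    /* Assumption: a real pidfd would be handed out by the kernel
     * instead of being opened by path like this. */
    dirfd = open(dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (dirfd < 0)
        return -1;

    /* The lookup is relative to dirfd, so at this step the caller
     * needs the fd, not a path through a reachable /proc mount. */
    fd = openat(dirfd, rel, O_RDONLY | O_CLOEXEC);
    close(dirfd);
    if (fd < 0)
        return -1;

    n = read(fd, buf, len - 1);
    close(fd);
    if (n < 0)
        return -1;

    buf[n] = '\0';
    return n;
}
```

On a system with procfs mounted, read_via_dirfd("/proc/self", "net/tcp",
buf, sizeof(buf)) reads the TCP socket table of the caller's network
namespace. The concern above is that the second step would work the same
way for a process that has no /proc mount at all, as long as it can get
a pidfd for itself.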