On Sat, Mar 30, 2019 at 2:37 PM Christian Brauner <christian@xxxxxxxxxx> wrote:
>
> On Sat, Mar 30, 2019 at 12:53:57PM +0100, Jürg Billeter wrote:
> > On Fri, 2019-03-29 at 16:54 +0100, Christian Brauner wrote:
> > > diff --git a/include/uapi/linux/wait.h b/include/uapi/linux/wait.h
> > > index ac49a220cf2a..d6c7c0701997 100644
> > > --- a/include/uapi/linux/wait.h
> > > +++ b/include/uapi/linux/wait.h
> > > @@ -18,5 +18,7 @@
> > >  #define P_PID 1
> > >  #define P_PGID 2
> > >
> > > +/* Get a file descriptor for /proc/<pid> of the corresponding pidfd */
> > > +#define PIDFD_GET_PROCFD _IOR('p', 1, int)
> > >
> > >  #endif /* _UAPI_LINUX_WAIT_H */
> >
> > This is missing an entry in Documentation/ioctl/ioctl-number.txt and is
> > actually conflicting with existing entries.
>
> Thanks. Yes, Jann mentioned this too.
>
> >
> > However, I'd actually prefer a syscall to allow strict whitelisting via
> > seccomp and avoid the other ioctl disadvantages that Daniel has already
> > mentioned.
>
> You can filter ioctls with seccomp.
>

You probably wouldn't even need to, because the ioctl is only useful if
the caller also holds a directory fd to the procfs root. The pidfd on
its own is useless with the ioctl.

There's also nothing to filter: one pidfd maps strictly to one specific
task, so you don't get access to anything you weren't already permitted
to see, and that's pretty neat the way it is.

If /proc is not mounted in the process's namespace, you'd need to pass a
procfs fd to the process explicitly. If it is mounted, it doesn't matter
anyway: even if the process can open /proc, hidepid-based restrictions
still apply. It's essentially a race-free openat.
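
For reference, here is a minimal sketch of how the request number from the
quoted patch is packed, assuming the generic _IOC macros from
<linux/ioctl.h>. The names struct ioctl_fields and decode_ioctl() are
hypothetical and exist only for illustration. The type character ('p') and
the sequence number (1) are what Documentation/ioctl/ioctl-number.txt
tracks, and the packed value is what a seccomp filter would compare against
the second ioctl() argument.

```c
#include <linux/ioctl.h>

/* Request number exactly as defined in the quoted patch. */
#define PIDFD_GET_PROCFD _IOR('p', 1, int)

/* Hypothetical helper type: the four fields packed into an ioctl number. */
struct ioctl_fields {
	unsigned int dir;   /* transfer direction, e.g. _IOC_READ */
	unsigned int type;  /* "magic" character, here 'p' */
	unsigned int nr;    /* sequence number within that type, here 1 */
	unsigned int size;  /* size of the argument type, here sizeof(int) */
};

/* Hypothetical helper: unpack a request number with the kernel's macros. */
struct ioctl_fields decode_ioctl(unsigned int request)
{
	struct ioctl_fields f;

	f.dir  = _IOC_DIR(request);
	f.type = _IOC_TYPE(request);
	f.nr   = _IOC_NR(request);
	f.size = _IOC_SIZE(request);
	return f;
}
```

The sketch decodes the fields instead of hard-coding a number because the
bit layout of the direction and size fields differs between architectures.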