On Thu, Oct 10, 2024 at 07:56:53AM +1100, Aleksa Sarai wrote: > On 2024-10-09, Jonathan Corbet <corbet@xxxxxxx> wrote: > > luca.boccassi@xxxxxxxxx writes: > > > > > As discussed at LPC24, add an ioctl with an extensible struct > > > so that more parameters can be added later if needed. Start with > > > returning pid/tgid/ppid and creds unconditionally, and cgroupid > > > optionally. > > > > I was looking this over, and a couple of questions came to mind... > > > > > Signed-off-by: Luca Boccassi <luca.boccassi@xxxxxxxxx> > > > --- > > > > [...] > > > > > diff --git a/fs/pidfs.c b/fs/pidfs.c > > > index 80675b6bf884..15cdc7fe4968 100644 > > > --- a/fs/pidfs.c > > > +++ b/fs/pidfs.c > > > @@ -2,6 +2,7 @@ > > > #include <linux/anon_inodes.h> > > > #include <linux/file.h> > > > #include <linux/fs.h> > > > +#include <linux/cgroup.h> > > > #include <linux/magic.h> > > > #include <linux/mount.h> > > > #include <linux/pid.h> > > > @@ -114,6 +115,83 @@ static __poll_t pidfd_poll(struct file *file, struct poll_table_struct *pts) > > > return poll_flags; > > > } > > > > > > +static long pidfd_info(struct task_struct *task, unsigned int cmd, unsigned long arg) > > > +{ > > > + struct pidfd_info __user *uinfo = (struct pidfd_info __user *)arg; > > > + size_t usize = _IOC_SIZE(cmd); > > > + struct pidfd_info kinfo = {}; > > > + struct user_namespace *user_ns; > > > + const struct cred *c; > > > + __u64 request_mask; > > > + > > > + if (!uinfo) > > > + return -EINVAL; > > > + if (usize < sizeof(struct pidfd_info)) > > > + return -EINVAL; /* First version, no smaller struct possible */ > > > + > > > + if (copy_from_user(&request_mask, &uinfo->request_mask, sizeof(request_mask))) > > > + return -EFAULT; > > > > You don't check request_mask for unrecognized flags, so user space will > > not get an error if it puts random gunk there. That, in turn, can make > > it harder to add new options in the future. > > In fairness, this is how statx works and statx does this to not require > syscall retries to figure out what flags the current kernel supports and > instead defers that to stx_mask. pidfd_info overwrites the request_mask with what is supported by the kernel. I don't think userspace setting random stuff in the request_mask is a problem. It would already be a problem with statx() and we haven't seen that so far. If userspace happens to set a some random bit in the request_mask and that bit ends up being used a few kernel releases later to e.g., retrieve additional information then all that happens is that userspace would now receive information they didn't need. That's not a problem. It is of course very different to e.g. adding a random bit in the flag mask of clone3() or mount_setattr() or any system call that changes kernel state based on the passed bits. In that case ignoring unknown bits and then starting to use them is obviously a big problem. The other related problem would be flag deprecation and reuse of a flag which (CLONE_DETACHED -> CLONE_PIDFD) also is only a real problem for system calls that alter kernel state. So overally, I think ignoring uknown bits in the request mask is safe. It needs to be documented of course.