On Fri, Apr 16, 2021 at 2:22 AM Matthew Bobrowski <repnop@xxxxxxxxxx> wrote:
>
> Introduce a new flag FAN_REPORT_PIDFD for fanotify_init(2) which
> allows userspace applications to control whether a pidfd is to be
> returned instead of a pid for `struct fanotify_event_metadata.pid`.
>
> FAN_REPORT_PIDFD is mutually exclusive with FAN_REPORT_TID as the
> pidfd API is currently restricted to only support pidfd generation for
> thread-group leaders. Attempting to set them both when calling
> fanotify_init(2) will result in -EINVAL being returned to the
> caller. As the pidfd API evolves and support is added for tids, this
> is something that could be relaxed in the future.
>
> If pidfd creation fails, the pid in struct fanotify_event_metadata is
> set to FAN_NOPIDFD(-1).

Hi Matthew,

All in all looks good, just a few small nits.

> Falling back and providing a pid instead of a
> pidfd on pidfd creation failures was considered, although this could
> possibly lead to confusion and unpredictability within userspace
> applications as distinguishing between whether an actual pidfd or pid
> was returned could be difficult, so it's best to be explicit.

I don't think this should even have been "considered", so I see little
value in keeping this paragraph in the commit message.
>
> Signed-off-by: Matthew Bobrowski <repnop@xxxxxxxxxx>
> ---
>  fs/notify/fanotify/fanotify_user.c | 33 +++++++++++++++++++++++++++---
>  include/linux/fanotify.h           |  2 +-
>  include/uapi/linux/fanotify.h      |  2 ++
>  3 files changed, 33 insertions(+), 4 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 9e0c1afac8bd..fd8ae88796a8 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -329,7 +329,7 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>  	struct fanotify_info *info = fanotify_event_info(event);
>  	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
>  	struct file *f = NULL;
> -	int ret, fd = FAN_NOFD;
> +	int ret, pidfd, fd = FAN_NOFD;
>  	int info_type = 0;
>
>  	pr_debug("%s: group=%p event=%p\n", __func__, group, event);
> @@ -340,7 +340,25 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>  	metadata.vers = FANOTIFY_METADATA_VERSION;
>  	metadata.reserved = 0;
>  	metadata.mask = event->mask & FANOTIFY_OUTGOING_EVENTS;
> -	metadata.pid = pid_vnr(event->pid);
> +
> +	if (FAN_GROUP_FLAG(group, FAN_REPORT_PIDFD) &&
> +		pid_has_task(event->pid, PIDTYPE_TGID)) {
> +		/*
> +		 * Given FAN_REPORT_PIDFD is to be mutually exclusive with
> +		 * FAN_REPORT_TID, panic here if the mutual exclusion is ever
> +		 * blindly lifted without pidfds for threads actually being
> +		 * supported.
> +		 */
> +		WARN_ON(FAN_GROUP_FLAG(group, FAN_REPORT_TID));

Better to use WARN_ON_ONCE(); the outcome of this error is not terrible.
Also, in the comment above I would not refer to this warning as a "panic".

> +
> +		pidfd = pidfd_create(event->pid, 0);
> +		if (unlikely(pidfd < 0))
> +			metadata.pid = FAN_NOPIDFD;
> +		else
> +			metadata.pid = pidfd;
> +	} else {
> +		metadata.pid = pid_vnr(event->pid);
> +	}

You should rebase your work on:
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git fsnotify
and resolve conflicts with the "unprivileged listener" code.
We need to make sure that a pidfd is not reported to an unprivileged
listener even if the group was initialized by a privileged process.
This is a conscious, conservative choice that we made for reporting
pid info to an unprivileged listener, and it can be revisited in the future.

>
>  	if (path && path->mnt && path->dentry) {
>  		fd = create_fd(group, path, &f);
> @@ -941,6 +959,15 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>  #endif
>  		return -EINVAL;
>
> +	/*
> +	 * A pidfd can only be returned for a thread-group leader; thus
> +	 * FAN_REPORT_TID and FAN_REPORT_PIDFD need to be mutually
> +	 * exclusive. Once the pidfd API supports the creation of pidfds on
> +	 * individual threads, then we can look at removing this conditional.
> +	 */
> +	if ((flags & FAN_REPORT_PIDFD) && (flags & FAN_REPORT_TID))
> +		return -EINVAL;
> +
>  	if (event_f_flags & ~FANOTIFY_INIT_ALL_EVENT_F_BITS)
>  		return -EINVAL;
>
> @@ -1312,7 +1339,7 @@ SYSCALL32_DEFINE6(fanotify_mark,
>   */
>  static int __init fanotify_user_setup(void)
>  {
> -	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 10);
> +	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 11);
>  	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9);
>
>  	fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
> diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> index 3e9c56ee651f..894740a6f4e0 100644
> --- a/include/linux/fanotify.h
> +++ b/include/linux/fanotify.h
> @@ -21,7 +21,7 @@
>  #define FANOTIFY_FID_BITS	(FAN_REPORT_FID | FAN_REPORT_DFID_NAME)
>
>  #define FANOTIFY_INIT_FLAGS	(FANOTIFY_CLASS_BITS | FANOTIFY_FID_BITS | \
> -				 FAN_REPORT_TID | \
> +				 FAN_REPORT_TID | FAN_REPORT_PIDFD | \
>  				 FAN_CLOEXEC | FAN_NONBLOCK | \
>  				 FAN_UNLIMITED_QUEUE | FAN_UNLIMITED_MARKS)
>

FAN_REPORT_PIDFD should be added to FANOTIFY_ADMIN_INIT_FLAGS from the
fsnotify branch.

Thanks,
Amir.
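
To make the two init-time checks discussed in this review concrete, here is a
rough userspace sketch of how they might combine. It is not the kernel code:
the SKETCH_* bit values and sketch_check_init_flags() are hypothetical, the
admin mask contains only the new flag, and returning -EPERM to an
unprivileged caller that asks for an admin-only flag is an assumption about
how the fsnotify branch treats FANOTIFY_ADMIN_INIT_FLAGS.

```c
#include <errno.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SKETCH_REPORT_PIDFD	0x1u
#define SKETCH_REPORT_TID	0x2u

/*
 * Hypothetical stand-in for FANOTIFY_ADMIN_INIT_FLAGS with the new flag
 * added; the other members of the real mask are omitted.
 */
#define SKETCH_ADMIN_INIT_FLAGS	(SKETCH_REPORT_PIDFD)

/* Return 0 if the flag combination is acceptable, or a negative errno. */
static int sketch_check_init_flags(unsigned int flags, int privileged)
{
	/* Assumed: unprivileged callers may not request admin-only flags. */
	if (!privileged && (flags & SKETCH_ADMIN_INIT_FLAGS))
		return -EPERM;

	/* From the quoted hunk: the two report flags are mutually exclusive. */
	if ((flags & SKETCH_REPORT_PIDFD) && (flags & SKETCH_REPORT_TID))
		return -EINVAL;

	return 0;
}
```

This covers only the fanotify_init() side. Whether a pidfd is withheld at
event-read time from an unprivileged listener on a group created by a
privileged process is a separate check in copy_event_to_user() and is not
modelled here.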