On Sat, May 22, 2021 at 12:19:16PM +0300, Amir Goldstein wrote:
> Reporting event->pid should depend on the privileges of the user that
> initialized the group, not the privileges of the user reading the
> events.
>
> Use an internal group flag FANOTIFY_UNPRIV to record the fact that the
> group was initialized by an unprivileged user.
>
> To be on the safe side, the permissions to setup filesystem and mount
> marks now require that both the user that initialized the group and
> the user setting up the mark have CAP_SYS_ADMIN.
>
> Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users")
> Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>

Thanks for sending through this patch Amir! In general, the patch looks
good to me, however there are just a few nits below.

> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 71fefb30e015..7df6cba4a06d 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -424,11 +424,18 @@ static ssize_t copy_event_to_user(struct fsnotify_group *group,
>           * events generated by the listener process itself, without disclosing
>           * the pids of other processes.
>           */
> -        if (!capable(CAP_SYS_ADMIN) &&
> +        if (FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
>              task_tgid(current) != event->pid)
>                  metadata.pid = 0;
>
> -        if (path && path->mnt && path->dentry) {
> +        /*
> +         * For now, we require fid mode for unprivileged listener, which does
> +         * record path events, but keep this check for safety in case we want
> +         * to allow unprivileged listener to get events with no fd and no fid
> +         * in the future.
> +         */

I think it's best if we keep clear of using first person in our comments
throughout our code base. Maybe we could change this to:

        * For now, fid mode is required for an unprivileged listener, which
          does record path events. However, this check must be kept...

> +        if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
> +            path && path->mnt && path->dentry) {
>                  fd = create_fd(group, path, &f);
>                  if (fd < 0)
>                          return fd;
> @@ -1040,6 +1047,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>          int f_flags, fd;
>          unsigned int fid_mode = flags & FANOTIFY_FID_BITS;
>          unsigned int class = flags & FANOTIFY_CLASS_BITS;
> +        unsigned int internal_flags = 0;
>
>          pr_debug("%s: flags=%x event_f_flags=%x\n",
>                   __func__, flags, event_f_flags);
> @@ -1053,6 +1061,13 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>                   */
>                  if ((flags & FANOTIFY_ADMIN_INIT_FLAGS) || !fid_mode)
>                          return -EPERM;
> +
> +                /*
> +                 * We set the internal flag FANOTIFY_UNPRIV on the group, so we
> +                 * know that we need to limit setting mount/filesystem marks on
> +                 * this group and avoid providing pid and open fd in the event.
> +                 */

Same comment as above applies here. This could be changed to:

        * Set the internal FANOTIFY_UNPRIV flag for this notification group
          so that certain restrictions can be enforced upon it. This includes
          things like preventing an unprivileged user from setting up
          mount/filesystem scoped marks and not returning an open file
          descriptor or pid meta-information within an event.

You can make it shorter if you like, but you get the drift.

> +                internal_flags |= FANOTIFY_UNPRIV;
>          }
>
>  #ifdef CONFIG_AUDITSYSCALL
> @@ -1105,7 +1120,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
>                  goto out_destroy_group;
>          }
>
> -        group->fanotify_data.flags = flags;
> +        group->fanotify_data.flags = flags | internal_flags;
>          group->memcg = get_mem_cgroup_from_mm(current->mm);
>
>          group->fanotify_data.merge_hash = fanotify_alloc_merge_hash();
> @@ -1305,11 +1320,13 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
>          group = f.file->private_data;
>
>          /*
> -         * An unprivileged user is not allowed to watch a mount point nor
> -         * a filesystem.
> +         * An unprivileged user is not allowed to setup mount point nor
                                                                      ^ s
> +         * filesystem marks. It is not allowed to setup those marks for
> +         * a group that was initialized by an unprivileged user.

I think the second sentence would better read as:

        * This also includes setting up such marks by a group that was
          initialized by an unprivileged user.

>          ret = -EPERM;
> -        if (!capable(CAP_SYS_ADMIN) &&
> +        if ((!capable(CAP_SYS_ADMIN) ||
> +             FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) &&

...

> diff --git a/fs/notify/fdinfo.c b/fs/notify/fdinfo.c
> index a712b2aaa9ac..57f0d5d9f934 100644
> --- a/fs/notify/fdinfo.c
> +++ b/fs/notify/fdinfo.c
> @@ -144,7 +144,7 @@ void fanotify_show_fdinfo(struct seq_file *m, struct file *f)
>          struct fsnotify_group *group = f->private_data;
>
>          seq_printf(m, "fanotify flags:%x event-flags:%x\n",
> -                   group->fanotify_data.flags,
> +                   group->fanotify_data.flags & FANOTIFY_INIT_FLAGS,
>                     group->fanotify_data.f_flags);

I feel like the internal initialization flags have been dropped off here,
as FANOTIFY_INIT_FLAGS technically wouldn't cover all flags present in
group->fanotify_data.flags with FANOTIFY_UNPRIV, right?

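To make the question concrete, here is a rough userspace sketch (not kernel code) of what the two hunks appear to do together: the internal bit is OR'ed into the stored flags at init time and then masked off again before fdinfo prints them. The FANOTIFY_UNPRIV value is taken from the patch; DEMO_INIT_FLAGS and the demo_* helpers are made-up names for illustration, and the 0x3ff mask is only a placeholder, not the real FANOTIFY_INIT_FLAGS value.

```c
#include <assert.h>
#include <stdint.h>

/* Value from the patch: internal-only bit, never passed in by userspace. */
#define FANOTIFY_UNPRIV 0x80000000u

/*
 * ASSUMPTION: placeholder standing in for the real FANOTIFY_INIT_FLAGS,
 * which is defined in include/linux/fanotify.h.
 */
#define DEMO_INIT_FLAGS 0x000003ffu

/* Hypothetical helper mirroring `flags | internal_flags` in fanotify_init(). */
uint32_t demo_store_flags(uint32_t user_flags, int unprivileged)
{
	uint32_t internal_flags = unprivileged ? FANOTIFY_UNPRIV : 0;

	return user_flags | internal_flags;
}

/* Hypothetical helper mirroring the masking in fanotify_show_fdinfo(). */
uint32_t demo_fdinfo_flags(uint32_t stored_flags)
{
	return stored_flags & DEMO_INIT_FLAGS;
}

/* Self-check: the internal bit is stored on the group but never displayed. */
void demo_self_check(void)
{
	uint32_t stored = demo_store_flags(0x8u, 1);

	assert(stored & FANOTIFY_UNPRIV);
	assert(demo_fdinfo_flags(stored) == 0x8u);
}
```

If that reading is right, the masking would mean fdinfo keeps showing only the flags that userspace could have passed to fanotify_init(), with FANOTIFY_UNPRIV hidden from the output.
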
>          show_fdinfo(m, f, fanotify_fdinfo);
> diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
> index bad41bcb25df..f277d1c4e6b8 100644
> --- a/include/linux/fanotify.h
> +++ b/include/linux/fanotify.h
> @@ -51,6 +51,10 @@ extern struct ctl_table fanotify_table[]; /* for sysctl */
>  #define FANOTIFY_INIT_FLAGS     (FANOTIFY_ADMIN_INIT_FLAGS | \
>                                   FANOTIFY_USER_INIT_FLAGS)
>
> +/* Internal flags */
> +#define FANOTIFY_UNPRIV         0x80000000
> +#define FANOTIFY_INTERNAL_FLAGS (FANOTIFY_UNPRIV)

Should we be more distinct here, i.e. FANOTIFY_INTERNAL_INIT_FLAGS? Just
thinking about a possible case where there are some other internal
fanotify flags that are used for something else?

>  #define FANOTIFY_MARK_TYPE_BITS (FAN_MARK_INODE | FAN_MARK_MOUNT | \
>                                   FAN_MARK_FILESYSTEM)

/M