On Fri, May 21, 2021 at 05:14:15PM +0200, Jan Kara wrote:
> On Fri 21-05-21 16:52:08, Amir Goldstein wrote:
> > On Fri, May 21, 2021 at 4:19 PM Jan Kara <jack@xxxxxxx> wrote:
> > >
> > > On Fri 21-05-21 14:10:32, Amir Goldstein wrote:
> > > > On Fri, May 21, 2021 at 1:24 PM Jan Kara <jack@xxxxxxx> wrote:
> > > > >
> > > > > On Fri 21-05-21 12:41:51, Amir Goldstein wrote:
> > > > > > On Fri, May 21, 2021 at 12:22 PM Matthew Bobrowski <repnop@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Hey Amir/Christian,
> > > > > > >
> > > > > > > On Thu, May 20, 2021 at 04:43:48PM +0300, Amir Goldstein wrote:
> > > > > > > > On Thu, May 20, 2021 at 11:17 AM Christian Brauner
> > > > > > > > <christian.brauner@xxxxxxxxxx> wrote:
> > > > > > > > > > +#define FANOTIFY_PIDFD_INFO_HDR_LEN \
> > > > > > > > > > +	sizeof(struct fanotify_event_info_pidfd)
> > > > > > > > > >
> > > > > > > > > >  static int fanotify_fid_info_len(int fh_len, int name_len)
> > > > > > > > > >  {
> > > > > > > > > > @@ -141,6 +143,9 @@ static int fanotify_event_info_len(unsigned int info_mode,
> > > > > > > > > >  	if (fh_len)
> > > > > > > > > >  		info_len += fanotify_fid_info_len(fh_len, dot_len);
> > > > > > > > > >
> > > > > > > > > > +	if (info_mode & FAN_REPORT_PIDFD)
> > > > > > > > > > +		info_len += FANOTIFY_PIDFD_INFO_HDR_LEN;
> > > > > > > > > > +
> > > > > > > > > >  	return info_len;
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > @@ -401,6 +406,29 @@ static int copy_fid_info_to_user(__kernel_fsid_t *fsid,
> > > > > > > > > >  	return info_len;
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > +static int copy_pidfd_info_to_user(struct pid *pid,
> > > > > > > > > > +				   char __user *buf,
> > > > > > > > > > +				   size_t count)
> > > > > > > > > > +{
> > > > > > > > > > +	struct fanotify_event_info_pidfd info = { };
> > > > > > > > > > +	size_t info_len = FANOTIFY_PIDFD_INFO_HDR_LEN;
> > > > > > > > > > +
> > > > > > > > > > +	if (WARN_ON_ONCE(info_len > count))
> > > > > > > > > > +		return -EFAULT;
> > > > > > > > > > +
> > > > > > > > > > +	info.hdr.info_type = FAN_EVENT_INFO_TYPE_PIDFD;
> > > > > > > > > > +	info.hdr.len = info_len;
> > > > > > > > > > +
> > > > > > > > > > +	info.pidfd = pidfd_create(pid, 0);
> > > > > > > > > > +	if (info.pidfd < 0)
> > > > > > > > > > +		info.pidfd = FAN_NOPIDFD;
> > > > > > > > > > +
> > > > > > > > > > +	if (copy_to_user(buf, &info, info_len))
> > > > > > > > > > +		return -EFAULT;
> > > > > > > > > >
> > > > > > > > > Hm, well this kinda sucks. The caller can end up with a pidfd in their
> > > > > > > > > fd table and when the copy_to_user() failed they won't know what fd it
> > > > > > > >
> > > > > > > > Good catch!
> > > > > > >
> > > > > > > Super awesome catch Christian, thanks for pulling this up!
> > > > > > >
> > > > > > > > But I prefer to solve it differently, because moving fd_install() to the
> > > > > > > > end of this function does not guarantee that copy_event_to_user()
> > > > > > > > won't return an error one day with dangling pidfd in fd table.
> > > > > > >
> > > > > > > I can see the angle you're approaching this from...
> > > > > > >
> > > > > > > > It might be simpler to do pidfd_create() next to create_fd() in
> > > > > > > > copy_event_to_user() and pass pidfd to copy_pidfd_info_to_user().
> > > > > > > > pidfd can be closed on error along with fd on out_close_fd label.
> > > > > > > >
> > > > > > > > You also forgot to add CAP_SYS_ADMIN check before pidfd_create()
> > > > > > > > (even though fanotify_init() does check for that).
> > > > > > >
> > > > > > > I didn't really understand the need for this check here given that the
> > > > > > > administrative bits are already being checked for in fanotify_init()
> > > > > > > i.e. FAN_REPORT_PIDFD can never be set for an unprivileged listener;
> > > > > > > thus never walking any of the pidfd_mode paths. Is this just a defense
> > > > > > > in depth approach here, or is it something else that I'm missing?
> > > > > > >
> > > > > >
> > > > > > We want to be extra careful not to create privilege escalations,
> > > > > > so even if the fanotify fd is leaked or intentionally passed to a less
> > > > > > privileged user, it cannot get an open pidfd.
> > > > > >
> > > > > > IOW, it is *much* easier to be defensive in this case than to prove
> > > > > > that the change cannot introduce any privilege escalations.
> > > > >
> > > > > I have no problems with being more defensive (it's certainly better than
> > > > > being too lax) but does it really make sense here? I mean if CAP_SYS_ADMIN
> > > > > task opens O_RDWR /etc/passwd and then passes this fd to unprivileged
> > > > > process, that process is also free to update all the passwords.
> > > > > Traditionally permission checks in Unix are performed on open and then who
> > > > > has fd can do whatever that fd allows... I've tried to follow similar
> > > > > philosophy with fanotify as well and e.g. open happening as a result of
> > > > > fanotify path events does not check permissions either.
> > > > >
> > > >
> > > > Agreed.
> > > >
> > > > However, because we had this issue with no explicit FAN_REPORT_PID
> > > > we added the CAP_SYS_ADMIN check for reporting event->pid as next
> > > > best thing. So now that becomes weird if priv process created fanotify fd
> > > > and passes it to unpriv process, then unpriv process gets events with
> > > > pidfd but without event->pid.
> > > >
> > > > We can change the code to:
> > > >
> > > >         if (!capable(CAP_SYS_ADMIN) && !pidfd_mode &&
> > > >             task_tgid(current) != event->pid)
> > > >                 metadata.pid = 0;
> > > >
> > > > So the case I described above ends up reporting both pidfd
> > > > and event->pid to unpriv user, but that is a bit inconsistent...
> > >
> > > Oh, now I see where you are coming from :) Thanks for the explanation. And
> > > remind me please, cannot we just have internal FAN_REPORT_PID flag that
> > > gets set on notification group when privileged process creates it and then
> > > test for that instead of CAP_SYS_ADMIN in copy_event_to_user()? It is
> > > mostly equivalent but I guess more in the spirit of how fanotify
> > > traditionally does things. Also FAN_REPORT_PIDFD could then behave in the
> > > same way...
> >
> > Yes, we can. In fact, we should call the internal flag FANOTIFY_UNPRIV
> > as it describes the situation better than FAN_REPORT_PID.
> > This happens to be how I implemented it in the initial RFC [1].
> >
> > It's not easy to follow our entire discussion on this thread, but I think
> > we can resurrect the FANOTIFY_UNPRIV internal flag and use it
> > in this case instead of CAP_SYS_ADMIN.
>
> I think at that time we were discussing how to handle opening of fds and
> we decided to not depend on FANOTIFY_UNPRIV and then I didn't see a value
> of that flag because I forgot about pids... Anyway now I agree to go for
> that flag. :)

Resurrection of this flag SGTM! However, it also sounds like we need to
land that series before this PIDFD series or simply incorporate the
UNPRIV flag into this one.

Will chat with Amir to get this done.

/M
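
For anyone skimming the thread, the cleanup ordering suggested above
(create both descriptors up front, do the copy, and close both on the
out_close_fd path if the copy fails) can be sketched in userspace. This
is only an analogy under stated assumptions, not the fanotify code:
copy_out() is a hypothetical stand-in for copy_to_user(),
open("/dev/null") stands in for both create_fd() and pidfd_create(), and
report_event(), struct event_info and NOPIDFD are names made up for the
illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define NOPIDFD (-1)	/* hypothetical; plays the role of FAN_NOPIDFD */

/* Hypothetical record handed back to the listener. */
struct event_info {
	int fd;
	int pidfd;
};

/* Hypothetical stand-in for copy_to_user(): fails on a NULL destination. */
static int copy_out(void *dst, const void *src, size_t len)
{
	if (!dst)
		return -EFAULT;
	memcpy(dst, src, len);
	return 0;
}

/*
 * Create both descriptors before copying anything out, so that a failed
 * copy can release them together and the caller is never left holding a
 * descriptor it was not told about.
 */
static int report_event(struct event_info *out)
{
	struct event_info info = { .fd = -1, .pidfd = NOPIDFD };
	int ret;

	info.fd = open("/dev/null", O_RDONLY);		/* stands in for create_fd() */
	if (info.fd < 0)
		return -errno;

	info.pidfd = open("/dev/null", O_RDONLY);	/* stands in for pidfd_create() */
	if (info.pidfd < 0)
		info.pidfd = NOPIDFD;			/* report "no pidfd", keep going */

	ret = copy_out(out, &info, sizeof(info));
	if (ret)
		goto out_close_fd;

	return 0;

out_close_fd:
	if (info.pidfd != NOPIDFD)
		close(info.pidfd);
	close(info.fd);
	return ret;
}
```

Calling report_event(NULL) exercises the failure path: it returns
-EFAULT and leaves the descriptor table as it found it, while a call
with a valid buffer hands back two open descriptors that the caller
then owns.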