On Tue, Dec 10, 2019 at 08:07:45AM -0800, Sargun Dhillon wrote: > On Tue, Dec 10, 2019 at 3:10 AM Christian Brauner > <christian.brauner@xxxxxxxxxx> wrote: > > > > [I'm expanding the Cc to a few Firefox and glibc people since we've been > > been talking about replacing SECCOMP_RET_TRAP with > > SECCOMP_RET_USER_NOTIF for a bit now because the useage of > > SECCOMP_RET_TRAP in the broker blocks desirable core glibc changes. > > Even if just for their lurking pleasure. :)] > > > > On Mon, Dec 09, 2019 at 09:46:35PM +0100, Oleg Nesterov wrote: > > > On 12/09, Christian Brauner wrote > > > > > > I agree, and I won't really argue... > > > > > > but the changelog in 2/4 says > > > > > > The requirement that the tracer has attached to the tracee prior to the > > > capture of the file descriptor may be lifted at a later point. > > > > > > so may be we should do this right now? > > > > I think so, yes. This doesn't strike me as premature optimization but > > rather as a core design questions. > > > > > > > > plus this part > > > > > > @@ -1265,7 +1295,8 @@ SYSCALL_DEFINE4(ptrace, long, request, long, pid, unsigned long, addr, > > > } > > > > > > ret = ptrace_check_attach(child, request == PTRACE_KILL || > > > - request == PTRACE_INTERRUPT); > > > + request == PTRACE_INTERRUPT || > > > + request == PTRACE_GETFD); > > > > > > actually means "we do not need ptrace, but we do not know where else we > > > can add this fd_install(get_task_file()). > > > > Right, I totally get your point and I'm not a fan of this being in > > ptrace() either. > > > > The way I see is is that the main use-case for this feature is the > > seccomp notifier and I can see this being useful. So the right place to > > plumb this into might just be seccomp and specifically on to of the > > notifier. > > If we don't care about getting and setting fds at random points of > > execution it might make sense to add new options to the notify ioctl(): > > > > #define SECCOMP_IOCTL_NOTIF_GET_FD SECCOMP_IOWR(3, <sensible struct>) > > #define SECCOMP_IOCTL_NOTIF_SET_FD SECCOMP_IOWR(4, <sensible struct>) > > > > which would let you get and set fds while the supervisee is blocked. > > > > Christian > Doesn't SECCOMP_IOCTL_NOTIF_GET_FD have some ambiguity to it? As Tycho mentioned, this is why we have a the tid of the calling task but we also have a cookie per request. The cookie is useful so that you can do - receive request <chocolate> cookie - open(/proc/<pid>{/mem}) - verify <chocolate> cookie still exists - <chocolate> cookie still exists -> file descriptor refers to correct task - <chocolate> cookie gone -> task has been recycled > Specifically, because > multiple processes can have the same notifier attached to them? If we > choose to go down the > route of introducing an ioctl (which I'm not at all opposed to), I > would rather do it on pidfd. We > can then plumb seccomp notifier to send pidfd instead of raw pid. In > the mean time, folks > can just open up /proc/${PID}, and do the check cookie dance. > > Christian, > As the maintainer of pidfd, what do you think? Let me quote what I wrote to the Mozilla folks today. :) "(One thing that always strikes me is that if my pidfd patches would've been ready back when we did the seccomp notifier we could've added a pidfd argument to the seccomp notifier kernel struct and if a flag is set given back a pidfd alongside the notifier fd. This way none of this revalidting the id stuff would've been necessary and you could also safely translate from a pidfd into a /proc/<pid> directory to e.g. open /proc/<pid>/mem. Anyway, that's not out of scope. One could still write a patch for that to add a pidfd argument under a new flag to the kernel struct. Should be rather trivial.)" So yeah, it crossed my mind. ;) I really would like to have this placed under a flag though... I very much dislike the idea of receiving any kind of fd - _especially a pidfd_ - implicitly. So ideally this would be a flag to the receive ioctl(). Kees just got my SECCOMP_USER_NOTIF_FLAG_CONTINUE patchset merged for v5.5 which adds the #define SECCOMP_USER_NOTIF_FLAG_CONTINUE (1UL << 0) flag which when set in the send case (i.e. supervisor -> kernel) will cause the syscall to be executed. When we add a new flag to get a pidfd it might make sense to rename the CONTINUE flag in master before v5.5 is out to #define SECCOMP_USER_NOTIF_SEND_FLAG_CONTINUE (1UL << 0) to indicate that it's only valid for the SEND ioctl(). Then we add #define SECCOMP_USER_NOTIF_RECV_FLAG_PIDFD (1UL << 0) for v5.6. This way send and receive flags are named differently for clarity. (I don't care about the name being long. Other people might though _shrug_.) Christian