On Mon, Oct 26, 2020 at 10:51:02AM +0100, Jann Horn wrote:
> The problem is the scenario where a process is interrupted while it's
> waiting for the supervisor to reply.
>
> Consider the following scenario (with supervisor "S" and target "T"; S
> wants to wait for events on two file descriptors seccomp_fd and
> other_fd):
>
> S: starts poll() to wait for events on seccomp_fd and other_fd
> T: performs a syscall that's filtered with RET_USER_NOTIF
> S: poll() returns and signals readiness of seccomp_fd
> T: receives signal SIGUSR1
> T: syscall aborts, enters signal handler
> T: signal handler blocks on unfiltered syscall (e.g. write())
> S: starts SECCOMP_IOCTL_NOTIF_RECV
> S: blocks because no syscalls are pending

Oooh, yes, ew. Thanks for the illustration. Thinking about this from
userspace's least-surprise view, I would expect the "recv" to stay
"queued", in the sense we'd see this:

S: starts poll() to wait for events on seccomp_fd and other_fd
T: performs a syscall that's filtered with RET_USER_NOTIF
S: poll() returns and signals readiness of seccomp_fd
T: receives signal SIGUSR1
T: syscall aborts, enters signal handler
T: signal handler blocks on unfiltered syscall (e.g. write())
S: starts SECCOMP_IOCTL_NOTIF_RECV
S: gets (stale) seccomp_notif from seccomp_fd
S: sends seccomp_notif_resp, receives ENOENT (or some better errno?)

This is not at all how things are designed internally right now, but
that behavior would work, yes?

-- 
Kees Cook
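
For illustration only, here is a minimal sketch of how a supervisor might
handle the last two steps of the second sequence above: receive a
notification, reply, and treat ENOENT on the reply as "the target already
abandoned that syscall". The names notif_outcome, classify_send() and
handle_one() are hypothetical, and the deny-with-EPERM reply is just a
placeholder policy. The sketch assumes a kernel and headers that provide
the seccomp user-notification ioctls. It does not by itself cure the
blocking RECV described in the quoted scenario; it only shows what the
supervisor side could look like if a stale notification were delivered.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/seccomp.h>

/* Hypothetical result type for one receive/reply round trip. */
enum notif_outcome {
    NOTIF_HANDLED, /* reply was delivered to the target */
    NOTIF_STALE,   /* target gave up on the syscall, e.g. after a signal */
    NOTIF_ERROR    /* anything else; errno has the details */
};

/* Map the result of SECCOMP_IOCTL_NOTIF_SEND onto an outcome. */
static enum notif_outcome classify_send(int rc, int err)
{
    if (rc == 0)
        return NOTIF_HANDLED;
    if (err == ENOENT)
        return NOTIF_STALE; /* notification id is no longer valid */
    return NOTIF_ERROR;
}

/* Receive one notification from seccomp_fd and answer it. */
static enum notif_outcome handle_one(int seccomp_fd)
{
    struct seccomp_notif req;
    struct seccomp_notif_resp resp;
    int rc;

    /* The receive buffer has to be zeroed before the ioctl. */
    memset(&req, 0, sizeof(req));
    if (ioctl(seccomp_fd, SECCOMP_IOCTL_NOTIF_RECV, &req) < 0)
        return NOTIF_ERROR;

    memset(&resp, 0, sizeof(resp));
    resp.id = req.id;
    resp.error = -EPERM; /* placeholder policy: fail the syscall */

    rc = ioctl(seccomp_fd, SECCOMP_IOCTL_NOTIF_SEND, &resp);
    return classify_send(rc, errno);
}
```

A supervisor loop would call handle_one() whenever poll() reports
seccomp_fd readable, and on NOTIF_STALE simply go back to poll() rather
than treating the ENOENT as fatal.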