On Wed, Apr 28, 2021 at 2:22 AM Tycho Andersen <tycho@tycho.pizza> wrote:
>
> On Tue, Apr 27, 2021 at 04:19:54PM -0700, Andy Lutomirski wrote:
> > User notifiers should allow correct emulation. Right now, it doesn't,
> > but there is no reason it can't.
>
> Thanks for the explanation.
>
> Consider fsmount, which has a,
>
>         ret = mutex_lock_interruptible(&fc->uapi_mutex);
>         if (ret < 0)
>                 goto err_fsfd;
>
> If a regular task is interrupted during that wait, it returns -EINTR
> or whatever back to userspace.
>
> Suppose that we intercept fsmount. The supervisor decides the mount is
> OK, does the fsmount, injects the mount fd into the container, and
> then the tracee receives a signal. At this point, the mount fd is
> visible inside the container. The supervisor gets a notification about
> the signal and revokes the mount fd, but there was some time where it
> was exposed in the container, whereas with the interrupt in the native
> syscall there was never any exposure.

IIUC, this is solved by my patch, patch 4 of the series. The
supervisor should do the addfd with the flag added in that patch
(SECCOMP_ADDFD_FLAG_SEND) for an atomic "addfd + send".

With the atomic "addfd + send", one of two things happens: either the
fd is added _and_ its value is returned to the syscall, or the fd is
not added at all and the container sees the syscall as interrupted.
Therefore, the fd is only visible to the container when it should be.

Best,
Rodrigo
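
For illustration, here is a minimal sketch of what the supervisor side of
the atomic "addfd + send" might look like. It assumes the flag is used with
the existing SECCOMP_IOCTL_NOTIF_ADDFD ioctl and struct seccomp_notif_addfd
from <linux/seccomp.h>. The helper name addfd_and_send() is made up for this
example, and the fallback #define is only there for headers that predate
the flag.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/seccomp.h>

/* Fallback for UAPI headers that do not yet carry the flag (assumed value). */
#ifndef SECCOMP_ADDFD_FLAG_SEND
#define SECCOMP_ADDFD_FLAG_SEND (1UL << 1)
#endif

/*
 * Hypothetical helper: install srcfd into the target task and, in the same
 * ioctl, answer notification `id` with the new fd number as the syscall's
 * return value.
 *
 * Returns the fd number allocated in the target, or -1 with errno set.
 * On failure nothing is installed, so the target never sees the fd.
 */
static int addfd_and_send(int notify_fd, uint64_t id, int srcfd)
{
	struct seccomp_notif_addfd addfd;

	memset(&addfd, 0, sizeof(addfd));
	addfd.id = id;                          /* from struct seccomp_notif */
	addfd.srcfd = (uint32_t)srcfd;          /* fd in the supervisor */
	addfd.flags = SECCOMP_ADDFD_FLAG_SEND;  /* add the fd and reply at once */
	addfd.newfd = 0;                        /* unused without SETFD */
	addfd.newfd_flags = 0;

	return ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd);
}
```

In the fsmount case above, the supervisor would call
addfd_and_send(notify_fd, req->id, mount_fd) instead of a plain addfd
followed by a separate SECCOMP_IOCTL_NOTIF_SEND. If the tracee was
interrupted in the meantime, the ioctl should fail and the supervisor only
has to close its own copy of the mount fd.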