On Wed, Apr 28, 2021 at 7:08 AM Tycho Andersen <tycho@tycho.pizza> wrote:
>
> On Wed, Apr 28, 2021 at 03:20:02PM +0200, Rodrigo Campos wrote:
> > On Wed, Apr 28, 2021 at 1:10 PM Rodrigo Campos <rodrigo@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Apr 28, 2021 at 2:22 AM Tycho Andersen <tycho@tycho.pizza> wrote:
> > > >
> > > > On Tue, Apr 27, 2021 at 04:19:54PM -0700, Andy Lutomirski wrote:
> > > > > User notifiers should allow correct emulation. Right now, it doesn't,
> > > > > but there is no reason it can't.
> > > >
> > > > Thanks for the explanation.
> > > >
> > > > Consider fsmount, which has a,
> > > >
> > > >         ret = mutex_lock_interruptible(&fc->uapi_mutex);
> > > >         if (ret < 0)
> > > >                 goto err_fsfd;
> > > >
> > > > If a regular task is interrupted during that wait, it return -EINTR
> > > > or whatever back to userspace.
> > > >
> > > > Suppose that we intercept fsmount. The supervisor decides the mount is
> > > > OK, does the fsmount, injects the mount fd into the container, and
> > > > then the tracee receives a signal. At this point, the mount fd is
> > > > visible inside the container. The supervisor gets a notification about
> > > > the signal and revokes the mount fd, but there was some time where it
> > > > was exposed in the container, whereas with the interrupt in the native
> > > > syscall there was never any exposure.
> > >
> > > IIUC, this is solved by my patch, patch 4 of the series. The
> > > supervisor should do the addfd with the flag added in that patch
> > > (SECCOMP_ADDFD_FLAG_SEND) for an atomic "addfd + send".
> >
> > Well, under Andy's proposal handling that is even simpler. If the
> > signal is delivered after we added the fd (note that the container
> > syscall does not return when the signal arrives, as it happens today,
> > it just signals the notifier and continues to wait), we can just
> > ignore the signal and return that (if that is the appropriate thing
> > for that syscall, but I guess after adding an fd there isn't any other
> > reasonable thing to do).
>
> Yes, agreed. After thinking about this more, my example is bogus: the
> kernel doesn't sleep after it installs the fd, so it would ignore any
> signals too.
>
> Even if the kernel *did* sleep after installing the fd, it would still
> be correct emulation to install it and then do whatever the kernel did
> during that sleep. So I withdraw my objection :)
>
> Thanks,
>
> Tycho

Great. I'll respin the series and add a
SECCOMP_IOCTL_NOTIF_SET_WAIT_KILLABLE command. We can do the other
optimizations mentioned above when specific use cases come up. I would
like to work on preemption notification after this lands, though.

_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers
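
For readers following the thread, here is a minimal sketch of the atomic "addfd + send" that Rodrigo describes, seen from the supervisor side. It assumes a kernel and UAPI headers that provide SECCOMP_IOCTL_NOTIF_ADDFD and SECCOMP_ADDFD_FLAG_SEND. The helper name addfd_and_send() and its parameters are hypothetical and do not come from the series. The proposed SECCOMP_IOCTL_NOTIF_SET_WAIT_KILLABLE command is not shown, since its interface is not spelled out in this thread.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/seccomp.h>

/* Fallback for older UAPI headers that predate the flag (assumption:
 * the value matches the one in linux/seccomp.h on newer kernels). */
#ifndef SECCOMP_ADDFD_FLAG_SEND
#define SECCOMP_ADDFD_FLAG_SEND (1UL << 1)
#endif

/*
 * Hypothetical helper, not part of the patch series.
 *
 * Installs the supervisor's descriptor `srcfd` into the tracee and
 * completes notification `id` in a single ioctl, so the tracee never
 * holds the fd while the notification is still pending.
 *
 * Returns the fd number allocated in the tracee (which also becomes the
 * return value of the intercepted syscall), or -1 with errno set.
 */
int addfd_and_send(int notify_fd, __u64 id, int srcfd)
{
	struct seccomp_notif_addfd addfd;

	memset(&addfd, 0, sizeof(addfd));
	addfd.id = id;                          /* cookie from SECCOMP_IOCTL_NOTIF_RECV */
	addfd.srcfd = srcfd;                    /* fd in the supervisor, e.g. from fsmount() */
	addfd.flags = SECCOMP_ADDFD_FLAG_SEND;  /* add the fd and respond atomically */
	addfd.newfd = 0;                        /* ignored without SECCOMP_ADDFD_FLAG_SETFD */
	addfd.newfd_flags = 0;                  /* could be O_CLOEXEC if the tracee expects it */

	return ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_ADDFD, &addfd);
}
```

Without the SEND flag the supervisor would need a separate SECCOMP_IOCTL_NOTIF_SEND after the ADDFD, and a signal arriving between the two calls is the window discussed above.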