On Sat, Oct 24, 2020 at 2:53 PM Michael Kerrisk (man-pages) <mtk.manpages@xxxxxxxxx> wrote: > On 10/17/20 2:25 AM, Jann Horn wrote: > > On Fri, Oct 16, 2020 at 8:29 PM Michael Kerrisk (man-pages) > > <mtk.manpages@xxxxxxxxx> wrote: [...] > >> I'm not sure if I should write anything about this small UAPI > >> breakage in BUGS, or not. Your thoughts? > > > > Thinking about it a bit more: Any code that relies on pause() or > > epoll_wait() not restarting is buggy anyway, right? Because a signal > > could also arrive directly before entering the syscall, while > > userspace code is still executing? So one could argue that we're just > > enlarging a preexisting race. (Unless the signal handler checks the > > interrupted register state to figure out whether we already entered > > syscall handling?) > > Yes, that all makes sense. > > > If userspace relies on non-restarting behavior, it should be using > > something like epoll_pwait(). And that stuff only unblocks signals > > after we've already past the seccomp checks on entry. > > Thanks for elaborating that detail, since as soon as you talked > about "enlarging a preexisting race" above, I immediately wondered > sigsuspend(), pselect(), etc. > > (Mind you, I still wonder about the effect on system calls that > are normally nonrestartable because they have timeouts. My > understanding is that the kernel doesn't restart those system > calls because it's impossible for the kernel to restart the call > with the right timeout value. I wonder what happens when those > system calls are restarted in the scenario we're discussing.) Ah, that's an interesting edge case... > Anyway, returning to your point... So, to be clear (and to > quickly remind myself in case I one day reread this thread), > there is not a problem with sigsuspend(), pselect(), ppoll(), > and epoll_pwait() since: > > * Before the syscall, signals are blocked in the target. > * Inside the syscall, signals are still blocked at the time > the check is made for seccomp filters. > * If a seccomp user-space notification event kicks, the target > is put to sleep with the signals still blocked. > * The signal will only get delivered after the supervisor either > triggers a spoofed success/failure return in the target or the > supervisor sends a CONTINUE response to the kernel telling it > to execute the target's system call. Either way, there won't be > any restarting of the target's system call (and the supervisor > thus won't see multiple notifications). > > (Right?) Yeah. [...] > > So we should probably document the restarting behavior as something > > the supervisor has to deal with in the manpage; but for the > > "non-restarting syscalls can restart from the target's perspective" > > aspect, it might be enough to document this as quirky behavior that > > can't actually break correct code? (Or not document it at all. Dunno.) > > So, I've added the following to the page: > > Interaction with SA_RESTART signal handlers > Consider the following scenario: > > · The target process has used sigaction(2) to install a signal > handler with the SA_RESTART flag. > > · The target has made a system call that triggered a seccomp user- > space notification and the target is currently blocked until the > supervisor sends a notification response. > > · A signal is delivered to the target and the signal handler is > executed. > > · When (if) the supervisor attempts to send a notification > response, the SECCOMP_IOCTL_NOTIF_SEND ioctl(2)) operation will > fail with the ENOENT error. > > In this scenario, the kernel will restart the target's system > call. Consequently, the supervisor will receive another user- > space notification. Thus, depending on how many times the blocked > system call is interrupted by a signal handler, the supervisor may > receive multiple notifications for the same system call in the > target. > > One oddity is that system call restarting as described in this > scenario will occur even for the blocking system calls listed in > signal(7) that would never normally be restarted by the SA_RESTART > flag. > > Does that seem okay? Sounds good to me. > In addition, I've queued a cross-reference in signal(7): > > In certain circumstances, the seccomp(2) user-space notifi‐ > cation feature can lead to restarting of system calls that > would otherwise never be restarted by SA_RESTART; for > details, see seccomp_user_notif(2). _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/containers