On Thu, Feb 10, 2022 at 07:01:39PM +0100, Jann Horn wrote:
> On Thu, Feb 10, 2022 at 6:37 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > On Thu, Feb 10, 2022 at 05:18:39PM +0100, Jann Horn wrote:
> > > On Thu, Feb 10, 2022 at 3:53 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > > > Fatal SIGSYS signals were not being delivered to pid namespace init
> > > > processes. Make sure the SIGNAL_UNKILLABLE doesn't get set for these
> > > > cases.
> > > >
> > > > Reported-by: Robert Święcki <robert@xxxxxxxxxxx>
> > > > Suggested-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> > > > Fixes: 00b06da29cf9 ("signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed")
> > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
> > > > ---
> > > >  kernel/signal.c | 5 +++--
> > > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/signal.c b/kernel/signal.c
> > > > index 38602738866e..33e3ee4f3383 100644
> > > > --- a/kernel/signal.c
> > > > +++ b/kernel/signal.c
> > > > @@ -1342,9 +1342,10 @@ force_sig_info_to_task(struct kernel_siginfo *info, struct task_struct *t,
> > > >  	}
> > > >  	/*
> > > >  	 * Don't clear SIGNAL_UNKILLABLE for traced tasks, users won't expect
> > > > -	 * debugging to leave init killable.
> > > > +	 * debugging to leave init killable, unless it is intended to exit.
> > > >  	 */
> > > > -	if (action->sa.sa_handler == SIG_DFL && !t->ptrace)
> > > > +	if (action->sa.sa_handler == SIG_DFL &&
> > > > +	    (!t->ptrace || (handler == HANDLER_EXIT)))
> > > >  		t->signal->flags &= ~SIGNAL_UNKILLABLE;
> > >
> > > You're changing the subclause:
> > >
> > >     !t->ptrace
> > >
> > > to:
> > >
> > >     (!t->ptrace || (handler == HANDLER_EXIT))
> > >
> > > which means that the change only affects cases where the process has a
> > > ptracer, right? That's not the scenario the commit message is talking
> > > about...
> >
> > Sorry, yes, I was not as accurate as I should have been in the commit
> > log.
> > I have changed it to:
> >
> >     Fatal SIGSYS signals (i.e. seccomp RET_KILL_* syscall filter actions)
> >     were not being delivered to ptraced pid namespace init processes. Make
> >     sure the SIGNAL_UNKILLABLE doesn't get set for these cases.
>
> So basically force_sig_info() is trying to figure out whether
> get_signal() will later on check for SIGNAL_UNKILLABLE (the SIG_DFL
> case), and if so, it clears the flag from the target's signal_struct
> that marks the process as unkillable?
>
> This used to be:
>
>     if (action->sa.sa_handler == SIG_DFL)
>         t->signal->flags &= ~SIGNAL_UNKILLABLE;
>
> Then someone noticed that in the ptrace case, the signal might not
> actually end up being consumed by the target process, and added the
> "&& !t->ptrace" clause in commit
> eb61b5911bdc923875cde99eb25203a0e2b06d43.
>
> And now Robert Swiecki noticed that that still didn't accurately model
> what'll happen in get_signal().
>
> This seems hacky to me, and also racy: What if, while you're going
> through a SECCOMP_RET_KILL_PROCESS in an unkillable process, some
> other thread e.g. concurrently changes the disposition of SIGSYS from
> a custom handler to SIG_DFL?

Do you mean after force_sig_info_to_task() has finished but before
get_signal()? SA_IMMUTABLE will block changes to the action.

If you mean before force_sig_info_to_task(), I don't see how that's
possible since it's under lock:

	if (blocked || ignored || (handler != HANDLER_CURRENT)) {
		action->sa.sa_handler = SIG_DFL;
		if (handler == HANDLER_EXIT)
			action->sa.sa_flags |= SA_IMMUTABLE;
	...
	if (action->sa.sa_handler == SIG_DFL &&
	    (!t->ptrace || (handler == HANDLER_EXIT)))
		t->signal->flags &= ~SIGNAL_UNKILLABLE;

Given handler = HANDLER_EXIT, it'll always be SIG_DFL.
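
For illustration only, the old and new conditions can be compared in a
small userspace model. This is a sketch, not kernel code: the enum and
the two function names are hypothetical stand-ins, and the booleans
stand in for "action->sa.sa_handler == SIG_DFL" and "t->ptrace". It
only shows that the two conditions differ in exactly the ptraced
HANDLER_EXIT case.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the handler modes discussed in the thread. */
enum sig_handler_model {
	MODEL_HANDLER_CURRENT,	/* keep the currently installed disposition */
	MODEL_HANDLER_SIG_DFL,	/* reset the disposition to SIG_DFL */
	MODEL_HANDLER_EXIT,	/* forced exit, e.g. seccomp RET_KILL_* */
};

/* Condition before the patch: SIG_DFL and no ptracer attached. */
static bool clears_unkillable_old(bool is_sig_dfl, bool ptraced,
				  enum sig_handler_model handler)
{
	(void)handler;	/* the old condition ignores the handler mode */
	return is_sig_dfl && !ptraced;
}

/* Condition after the patch: a forced exit also clears the flag when traced. */
static bool clears_unkillable_new(bool is_sig_dfl, bool ptraced,
				  enum sig_handler_model handler)
{
	return is_sig_dfl && (!ptraced || handler == MODEL_HANDLER_EXIT);
}
```

With is_sig_dfl and ptraced both true and MODEL_HANDLER_EXIT, the old
model returns false and the new one returns true; for untraced tasks
the two agree.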

> Instead of trying to figure out whether the signal would have been
> fatal without SIGNAL_UNKILLABLE, I think it would be better to find a
> way to tell the signal-handling code that SIGNAL_UNKILLABLE should be
> bypassed for this specific signal, or something along those lines...
> but of course that's also kind of messy because the signal-sending
> code might fall back to just using the pending signal mask on
> allocation failure IIRC?

My original patch aimed that way:

diff --git a/kernel/signal.c b/kernel/signal.c
index 9b04631acde8..c124a09de6de 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2787,7 +2787,8 @@ bool get_signal(struct ksignal *ksig)
 		 * case, the signal cannot be dropped.
 		 */
 		if (unlikely(signal->flags & SIGNAL_UNKILLABLE) &&
-				!sig_kernel_only(signr))
+				!sig_kernel_only(signr) &&
+				!(ka->sa.sa_flags & SA_IMMUTABLE))
 			continue;
 
 		if (sig_kernel_stop(signr)) {

But I don't think there's a race, and Eric's suggestion seemed better in
the sense that the state change is entirely contained by
force_sig_info_to_task().

-Kees

-- 
Kees Cook