Hi Roman, let me apologize again for the huge delay. I see nothing really wrong in this version, so no objections from me. However, 4/9 doesn't apply, so it seems you will need to make v10 anyway to adapt these changes to the recent changes in kernel/signal.c ;) Just a couple of minor nits below... On 03/16, Roman Gushchin wrote: > > + * If always_leave is not set, and the cgroup is freezing, > + * we're racing with the cgroup freezing. In this case, we don't > + * drop the frozen counter to avoid a transient switch to > + * the unfrozen state. To make sure that the task won't go > + * to the userspace before reaching the signal handler loop, > + * let's set TIF_SIGPENDING flag. > + */ > +void cgroup_leave_frozen(bool always_leave) > +{ > + struct cgroup *cgrp; > + > + spin_lock_irq(&css_set_lock); > + cgrp = task_dfl_cgroup(current); > + if (always_leave || !test_bit(CGRP_FREEZE, &cgrp->flags)) { > + cgroup_dec_frozen_cnt(cgrp); > + cgroup_update_frozen(cgrp); > + WARN_ON_ONCE(!current->frozen); > + current->frozen = false; > + } else { > + set_tsk_thread_flag(current, TIF_SIGPENDING); The setting of TIF_SIGPENDING looks unnecessary and even not correct; because this flag must not be updated without ->siglock held (even if "set" is more or less safe). If JOBCTL_TRAP_FREEZE is already set, then TIF_SIGPENDING must be set too. Otherwise set_tsk_thread_flag(TIF_SIGPENDING) can't help because the task can do recalc_sigpending() at any moment. In particular, get_signal() does dequeue_signal()->recalc_sigpending() right after cgroup_leave_frozen(), so I fail to understand why do we need to set TIF_SIGPENDING. > @@ -912,6 +912,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) > tsk->fail_nth = 0; > #endif > > +#ifdef CONFIG_CGROUPS > + tsk->frozen = 0; > +#endif Hmm, do we really need this? How can a cgroup_task_frozen() task call copy_process() ? > +static void do_freezer_trap(void) > + __releases(¤t->sighand->siglock) > +{ > + /* > + * If a fatal signal is pending, there is no way back for the process, > + * so let it escape from the freezer trap and exit. > + * If the task has been frozen, cgroup_leave_frozen() will be invoked > + * to update the cgroup state, if necessary. > + */ > + if (fatal_signal_pending(current)) { > + current->jobctl &= ~JOBCTL_TRAP_FREEZE; > + spin_unlock_irq(¤t->sighand->siglock); > + return; > + } > + > + /* > + * If there are other trap bits pending except JOBCTL_TRAP_FREEZE, > + * let's make another loop to give it a chance to be handled. > + * In any case, we'll return back. > + */ > + if (((current->jobctl & (JOBCTL_PENDING_MASK | JOBCTL_TRAP_FREEZE)) != > + JOBCTL_TRAP_FREEZE) || fatal_signal_pending(current)) { ^^^^^^^^^^^^^^^^^^^^ We have already checked fatal_signal_pending() at the start? And in fact, you can probably remove fatal_signal_pending() altogether... Note that with recent changes get_signal() does if (signal_group_exit(signal)) { ksig->info.si_signo = signr = SIGKILL; sigdelset(¤t->pending.signal, SIGKILL); recalc_sigpending(); goto fatal; } before the main loop, so afaics fatal_signal_pending() == T in do_freezer_trap() is simply impossible. This means that you can't clear JOBCTL_TRAP_FREEZE, but this is probably fine... if not, you can add jobctl &= ~JOBCTL_TRAP_FREEZE into the "if (signal_group_exit(signal))" above. > @@ -2401,12 +2453,27 @@ bool get_signal(struct ksignal *ksig) > do_signal_stop(0)) > goto relock; > > - if (unlikely(current->jobctl & JOBCTL_TRAP_MASK)) { > - do_jobctl_trap(); > - spin_unlock_irq(&sighand->siglock); > + if (unlikely(current->jobctl & > + (JOBCTL_TRAP_MASK | JOBCTL_TRAP_FREEZE))) { > + if (current->jobctl & JOBCTL_TRAP_MASK) { > + do_jobctl_trap(); > + spin_unlock_irq(&sighand->siglock); > + } else if (current->jobctl & JOBCTL_TRAP_FREEZE) > + do_freezer_trap(); > + > goto relock; > } > > + /* > + * If the task is leaving the frozen state, let's update > + * cgroup counters and reset the frozen bit. > + */ > + if (unlikely(cgroup_task_frozen(current))) { > + spin_unlock_irq(&sighand->siglock); > + cgroup_leave_frozen(true); > + spin_lock_irq(&sighand->siglock); I'd suggest to do "goto relock" rather than spin_lock_irq(&sighand->siglock). To ensure we can't miss SIGKILL which can come right after we drop siglock, note again the new signal_group_exit() check above. Oleg.