Re: [PATCH] cgroup: freezer: Don't wake up process really blocked on signal

Oleg Nesterov <oleg@xxxxxxxxxx> · Wed, 18 Sep 2019 13:47:45 +0200

On 09/18, Honglei Wang wrote:
>
>
> On 9/17/19 9:47 PM, Oleg Nesterov wrote:
> >>>int main(int argc, char *argv[])
> >>>{
> >>>         sigset_t waitset;
> >>>         int signo;
> >>>
> >>>         sigemptyset(&waitset);
> >>>         sigaddset(&waitset, SIGINT);
> >>>         sigaddset(&waitset, SIGUSR1);
> >>>
> >>>         pthread_sigmask(SIG_BLOCK, &waitset, NULL);
> >>>
> >>>         for (;;) {
> >>>                 signo = sigwaitinfo(&waitset, NULL);
> >>>                 if (signo < 0)
> >>>                         err(1, "sigwaitinfo() failed");
> >>>
> >>>                 if (signo == SIGUSR1)
> >>>                         printf("Receive SIGUSR1\n");
> >>>                 else
> >>>                         break;
> >>>         }
> >>>
> >>>         return 0;
> >>>}
> >
> >Well, I think we do not care. Userspace should handle -EINTR and restart
> >if necessary. And afaics this test case can equally "fail" if it races
> >with the system freezer, freeze_task() can be called between
> >set_current_state(TASK_INTERRUPTIBLE) and freezable_schedule() in
> >do_sigtimedwait().
> >
> >OTOH, this is another indication that kernel/cgroup/freezer.c should
> >interact with freezer_do_not_count/freezer_count somehow.
> >
>
> It looks weird if a task is happily waiting for the desired signal, but is
> woke up by an move from one cgroup to another (which is really not related
> to any signal action from the point of userspace), doesn't it?

Again, imo a sane application should handle -EINTR. Nothing really weird.

There are other reasons why sigwaitinfo() can return -EINTR. Say, another
thread can steal a signal. Or, simply start you test-case and run
strace -p.

If you want to "fix" sigwaitinfo() - just make it restartable, that is all.
But see above, I don't think this makes any sense.

> At this point, seems we need to be sure if it's necessary for cgroup to fix
> this problem or not.

Well, to me this particular problem simply doesn't exist ;) However, cgroup
freezer has other, more serious problems.

> >>>diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/freezer.c
> >>>index 8cf010680678..08f6abacaa75 100644
> >>>--- a/kernel/cgroup/freezer.c
> >>>+++ b/kernel/cgroup/freezer.c
> >>>@@ -162,10 +162,14 @@ static void cgroup_freeze_task(struct task_struct *task, bool freeze)
> >>>  	if (freeze) {
> >>>  		task->jobctl |= JOBCTL_TRAP_FREEZE;
> >>>-		signal_wake_up(task, false);
> >>>+
> >>>+		if (sigisemptyset(&task->real_blocked))
> >>>+			signal_wake_up(task, false);
> >
> >This can't really help, ->real_blocked can be empty even if this task
> >sleeps in sigwaitinfo().
> >
>
> Sorry, I'm not quit clear about the 'sleep'. sigwaitinfo() sets
> ->real_blocked in do_sigtimedwait() and it goes to
> freezable_schedule_hrtimeout_range() soon. After the task is woke up,
> ->real_blocked is cleared. Seems there is no chance to set ->real_blocked to
> empty in this path. Do I miss something important here? Please correct me.

Hmm. do_sigtimedwait() does

	tsk->real_blocked = tsk->blocked;

Now, if tsk->blocked was empty (sigisemptyset() == T), then why do you think
the sigisemptyset(real_blocked) check in cgroup_freeze_task() won't be true?

But in fact this doesn't really matter. This patch is wrong in any case.
sigwaitinfo() must not block cgroup freezer.

Oleg.