On 9/27/21 9:13 AM, Eric W. Biederman wrote:
> Jens Axboe <axboe@xxxxxxxxx> writes:
>
>> On 9/27/21 8:29 AM, Jens Axboe wrote:
>>> On 9/27/21 7:51 AM, Eric W. Biederman wrote:
>>>> Jens Axboe <axboe@xxxxxxxxx> writes:
>>>>
>>>>> On 9/25/21 5:05 PM, Linus Torvalds wrote:
>>>>>> On Sat, Sep 25, 2021 at 1:32 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>> - io-wq core dump exit fix (me)
>>>>>>
>>>>>> Hmm.
>>>>>>
>>>>>> That one strikes me as odd.
>>>>>>
>>>>>> I get the feeling that if the io_uring thread needs to have that
>>>>>> signal_group_exit() test, something is wrong in signal-land.
>>>>>>
>>>>>> It's basically a "fatal signal has been sent to another thread", and I
>>>>>> really get the feeling that "fatal_signal_pending()" should just be
>>>>>> modified to handle that case too.
>>>>>
>>>>> It did surprise me as well, which is why that previous change ended up
>>>>> being broken for the coredump case... You could argue that the io-wq
>>>>> thread should just exit on signal_pending(), which is what we did
>>>>> before, but that really ends up sucking for workloads that do use
>>>>> signals for communication purposes. postgres was the reporter here.
>>>>
>>>> The primary function of get_signal is to make signals not pending. So I
>>>> don't understand any use of testing signal_pending after a call to
>>>> get_signal.
>>>>
>>>> My confusion doubles when I consider the fact io_uring threads should
>>>> only be dequeuing SIGSTOP and SIGKILL.
>>>>
>>>> I am concerned that an io_uring thread that dequeues SIGKILL won't call
>>>> signal_group_exit and thus kill the other threads in the thread group.
>>>>
>>>> What motivated removing the break and adding the fatal_signal_pending
>>>> test?
>>>
>>> I played with this a bit this morning, and I agree it doesn't seem to be
>>> needed at all. The original issue was with postgres, I'll give that a
>>> whirl as well and see if we run into any unwarranted exits. My simpler
>>> test case did not.
>>
>> Ran the postgres test, and we get tons of io-wq exiting on get_signal()
>> returning true. Took a closer look, and it actually looks very much
>> expected, as it's a SIGKILL to the original task.
>>
>> So it looks like I was indeed wrong, and this probably masked the
>> original issue that was fixed in that series. I've been running with
>> this:
>>
>> diff --git a/fs/io-wq.c b/fs/io-wq.c
>> index c2360cdc403d..afd1db8e000d 100644
>> --- a/fs/io-wq.c
>> +++ b/fs/io-wq.c
>> @@ -584,10 +584,9 @@ static int io_wqe_worker(void *data)
>>
>>  			if (!get_signal(&ksig))
>>  				continue;
>> -			if (fatal_signal_pending(current) ||
>> -			    signal_group_exit(current->signal))
>> -				break;
>> -			continue;
>> +			if (ksig.sig != SIGKILL)
>> +				printk("exit on sig! fatal? %d, sig=%d\n", fatal_signal_pending(current), ksig.sig);
>> +			break;
>>  		}
>>  		last_timeout = !ret;
>>  	}
>>
>> and it's running fine and, as expected, we don't generate any printk
>> activity as these are all fatal deliveries to the parent.
>
> Good. So just a break should be fine.

Indeed, I'll send out a patch for that.

> A little bit of me is concerned about not calling do_group_exit in this
> case. Fortunately it is not a problem as complete_signal kills all of
> the threads in a signal_group when SIGKILL is delivered.
>
> So at least until something else is refactored and io_uring threads
> unblock another fatal signal all is well.

Should we put a comment in io-wq to that effect? I don't see why we'd
ever unblock other signals there, but...

-- 
Jens Axboe
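
For readers following along, here is a minimal sketch of how the
signal-handling branch of io_wqe_worker() might look once the loop is
reduced to the plain break discussed above, with a comment capturing the
complete_signal() reasoning Eric describes. The surrounding context and
the comment wording are illustrative, not copied from the patch that
actually went upstream:

		if (signal_pending(current)) {
			struct ksignal ksig;

			if (!get_signal(&ksig))
				continue;
			/*
			 * io-wq workers only ever dequeue SIGSTOP/SIGKILL,
			 * so any signal get_signal() returns here is fatal.
			 * There is no need to call do_group_exit():
			 * complete_signal() already marked every thread in
			 * the group for exit when SIGKILL was delivered, so
			 * just break out and let this worker exit.
			 */
			break;
		}
		last_timeout = !ret;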