Jens Axboe <axboe@xxxxxxxxx> writes: > On 9/25/21 5:05 PM, Linus Torvalds wrote: >> On Sat, Sep 25, 2021 at 1:32 PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>> >>> - io-wq core dump exit fix (me) >> >> Hmm. >> >> That one strikes me as odd. >> >> I get the feeling that if the io_uring thread needs to have that >> signal_group_exit() test, something is wrong in signal-land. >> >> It's basically a "fatal signal has been sent to another thread", and I >> really get the feeling that "fatal_signal_pending()" should just be >> modified to handle that case too. > > It did surprise me as well, which is why that previous change ended up > being broken for the coredump case... You could argue that the io-wq > thread should just exit on signal_pending(), which is what we did > before, but that really ends up sucking for workloads that do use > signals for communication purposes. postgres was the reporter here. The primary function get_signal is to make signals not pending. So I don't understand any use of testing signal_pending after a call to get_signal. My confusion doubles when I consider the fact io_uring threads should only be dequeuing SIGSTOP and SIGKILL. I am concerned that an io_uring thread that dequeues SIGKILL won't call signal_group_exit and thus kill the other threads in the thread group. What motivated removing the break and adding the fatal_signal_pending test? Eric