Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Sat, Sep 25, 2021 at 1:32 PM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> - io-wq core dump exit fix (me)
>
> Hmm.
>
> That one strikes me as odd.
>
> I get the feeling that if the io_uring thread needs to have that
> signal_group_exit() test, something is wrong in signal-land.
>
> It's basically a "fatal signal has been sent to another thread", and I
> really get the feeling that "fatal_signal_pending()" should just be
> modified to handle that case too.
>
> Because what about a number of other situations where we have that
> "killable" logic (ie "stop waiting for locks or IO if you're just
> going to get killed anyway" - things like lock_page_killable() and
> friends)
>
> Adding Eric, Oleg and Al to the participants, so that somebody else can pipe up.
>
> That piping up may quite possibly be to just tell me I'm being stupid,
> and that this is just a result of some io_uring thread thing, and
> nobody else has this problem.
>
> It's commit 87c169665578 ("io-wq: ensure we exit if thread group is
> exiting") in my tree.
>
> Comments?

I agree it smells.

It smells that there needs to be a test after get_signal returns with a
signal from an io_uring thread.

As I recall, io-wq threads block all signals except for SIGSTOP and
SIGKILL. The signal SIGSTOP is never returned from get_signal. So at
first glance every return from get_signal should be SIGKILL and thus
fatal.

So I don't understand why there needs to be a test at all after
get_signal. Further, the SIGKILL should have been delivered by
get_signal, so it should no longer be pending, and I am not certain
when fatal_signal_pending() would be true.

Can someone please explain commit 15e20db2e0ce ("io-wq: only exit on
fatal signals") to me?

What signals is get_signal returning to be delivered to userspace that
are not fatal and that are ok to ignore?

I think I am missing something.

Eric
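
The reasoning about which signals can come back from get_signal can be checked with a small userspace model. This is only a sketch of the argument as stated in this message, not kernel code: every `model_*` name is hypothetical, the one-bit-per-signal mask is an assumption made for illustration, and only the standard signals 1..31 are considered.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: one bit per signal number (model only, not kernel code). */
static uint64_t model_sigbit(int signo)
{
    assert(signo >= 1 && signo <= 64);
    return UINT64_C(1) << (signo - 1);
}

/* Assumed mask from the message: everything blocked except SIGKILL and SIGSTOP. */
static uint64_t model_iowq_blocked(void)
{
    return ~(model_sigbit(SIGKILL) | model_sigbit(SIGSTOP));
}

/* Would a get_signal-like dequeue hand this signal back to its caller? */
static bool model_returned_to_caller(uint64_t blocked, int signo)
{
    if (blocked & model_sigbit(signo))
        return false;   /* blocked signals stay pending and are not dequeued */
    if (signo == SIGSTOP)
        return false;   /* stops are handled internally and never returned */
    return true;
}

/* Count the standard signals (1..31) the model returns; *last gets the last one seen. */
static int model_count_returned(uint64_t blocked, int *last)
{
    int count = 0;

    for (int signo = 1; signo <= 31; signo++) {
        if (model_returned_to_caller(blocked, signo)) {
            count++;
            *last = signo;
        }
    }
    return count;
}
```

Under these assumptions, calling model_count_returned(model_iowq_blocked(), &last) finds exactly one returnable signal, and it is SIGKILL. That matches the expectation above that any return from get_signal in such a thread should already be fatal. The model says nothing about what the real code does when a different thread in the group receives the fatal signal.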