Am 26.03.21 um 15:53 schrieb Jens Axboe: > On 3/26/21 8:45 AM, Stefan Metzmacher wrote: >> Am 26.03.21 um 15:43 schrieb Stefan Metzmacher: >>> Am 26.03.21 um 15:38 schrieb Jens Axboe: >>>> On 3/26/21 7:59 AM, Jens Axboe wrote: >>>>> On 3/26/21 7:54 AM, Jens Axboe wrote: >>>>>>> The KILL after STOP deadlock still exists. >>>>>> >>>>>> In which tree? Sounds like you're still on the old one with that >>>>>> incremental you sent, which wasn't complete. >>>>>> >>>>>>> Does io_wq_manager() exits without cleaning up on SIGKILL? >>>>>> >>>>>> No, it should kill up in all cases. I'll try your stop + kill, I just >>>>>> tested both of them separately and didn't observe anything. I also ran >>>>>> your io_uring-cp example (and found a bug in the example, fixed and >>>>>> pushed), fwiw. >>>>> >>>>> I can reproduce this one! I'll take a closer look. >>>> >>>> OK, that one is actually pretty straight forward - we rely on cleaning >>>> up on exit, but for fatal cases, get_signal() will call do_exit() for us >>>> and never return. So we might need a special case in there to deal with >>>> that, or some other way of ensuring that fatal signal gets processed >>>> correctly for IO threads. >>> >>> And if (fatal_signal_pending(current)) doesn't prevent get_signal() from being called? >> >> Ah, we're still in the first get_signal() from SIGSTOP, correct? > > Yes exactly, we're waiting in there being stopped. So we either need to > check to something ala: > > relock: > + if (current->flags & PF_IO_WORKER && fatal_signal_pending(current)) > + return false; > > to catch it upfront and from the relock case, or add: > > fatal: > + if (current->flags & PF_IO_WORKER) > + return false; > > to catch it in the fatal section. > Or something like io_uring_files_cancel() Maybe change current->pf_io_worker with a generic current->io_thread structure which, has exit hooks, as well as io_wq_worker_sleeping() and io_wq_worker_running(). Maybe create_io_thread would take such an structure as argument instead of a single function pointer. struct io_thread_description { const char *name; int (*thread_fn)(struct io_thread_description *); void (*sleeping_fn)((struct io_thread_description *); void (*running_fn)((struct io_thread_description *); void (*exit_fn)((struct io_thread_description *); }; And then struct io_wq_manager { struct io_thread_description description; ... manager specific stuff... }; metze