On 5/27/21 9:21 AM, Olivier Langlois wrote: > On Thu, 2021-05-27 at 07:46 -0600, Jens Axboe wrote: >> On 5/26/21 12:21 PM, Olivier Langlois wrote: >>> This is required for proper core dump generation. >>> >>> Because signals are normally serviced before resuming userspace and >>> an >>> io_worker thread will never resume userspace, it needs to >>> explicitly >>> call the signal servicing functions. >>> >>> Also, notice that it is possible to exit from the io_wqe_worker() >>> function main loop while having a pending signal such as when >>> the IO_WQ_BIT_EXIT bit is set. >>> >>> It is crucial to service any pending signal before calling >>> do_exit() >>> Proper coredump generation is relying on PF_SIGNALED to be set. >>> >>> More specifically, exit_mm() is using this flag to wait for the >>> core dump completion before releasing its memory descriptor. >>> >>> Signed-off-by: Olivier Langlois <olivier@xxxxxxxxxxxxxx> >>> --- >>> fs/io-wq.c | 22 ++++++++++++++++++++-- >>> 1 file changed, 20 insertions(+), 2 deletions(-) >>> >>> diff --git a/fs/io-wq.c b/fs/io-wq.c >>> index 5361a9b4b47b..b76c61e9aff2 100644 >>> --- a/fs/io-wq.c >>> +++ b/fs/io-wq.c >>> @@ -9,8 +9,6 @@ >>> #include <linux/init.h> >>> #include <linux/errno.h> >>> #include <linux/sched/signal.h> >>> -#include <linux/mm.h> >>> -#include <linux/sched/mm.h> >>> #include <linux/percpu.h> >>> #include <linux/slab.h> >>> #include <linux/rculist_nulls.h> >>> @@ -193,6 +191,26 @@ static void io_worker_exit(struct io_worker >>> *worker) >>> >>> kfree_rcu(worker, rcu); >>> io_worker_ref_put(wqe->wq); >>> + /* >>> + * Because signals are normally serviced before resuming >>> userspace and an >>> + * io_worker thread will never resume userspace, it needs >>> to explicitly >>> + * call the signal servicing functions. >>> + * >>> + * Also notice that it is possible to exit from the >>> io_wqe_worker() >>> + * function main loop while having a pending signal such as >>> when >>> + * the IO_WQ_BIT_EXIT bit is set. >>> + * >>> + * It is crucial to service any pending signal before >>> calling do_exit() >>> + * Proper coredump generation is relying on PF_SIGNALED to >>> be set. >>> + * >>> + * More specifically, exit_mm() is using this flag to wait >>> for the >>> + * core dump completion before releasing its memory >>> descriptor. >>> + */ >>> + if (signal_pending(current)) { >>> + struct ksignal ksig; >>> + >>> + get_signal(&ksig); >>> + } >>> do_exit(0); >>> } >> >> Do we need the same thing in fs/io_uring.c:io_sq_thread()? >> > Jens, > > You are 100% correct. In fact, this is the same problem for ALL > currently existing and future io threads. Therefore, I start to think > that the right place for the fix might be straight into do_exit()... That is what I was getting at. To avoid poluting do_exit() with it, I think it'd be best to add an io_thread_exit() that simply does: void io_thread_exit(void) { if (signal_pending(current)) { struct ksignal ksig; get_signal(&ksig); } do_exit(0); } and convert the do_exit() calls in io_uring/io-wq to io_thread_exit() instead. -- Jens Axboe