On 19 12:06:20, Jens Axboe wrote: > On 9/19/24 10:47 AM, Jan Hendrik Farr wrote: > >> [...] > >> > >> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c > >> index 75f0087183e5..56097627eafc 100644 > >> --- a/io_uring/io_uring.c > >> +++ b/io_uring/io_uring.c > >> @@ -2472,7 +2472,7 @@ static inline int io_cqring_wait_schedule(struct io_ring_ctx *ctx, > >> return 1; > >> if (unlikely(!llist_empty(&ctx->work_llist))) > >> return 1; > >> - if (unlikely(test_thread_flag(TIF_NOTIFY_SIGNAL))) > >> + if (unlikely(task_work_pending(current))) > >> return 1; > >> if (unlikely(task_sigpending(current))) > >> return -EINTR; > >> diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h > >> index 9d70b2cf7b1e..2fbf0ea9c171 100644 > >> --- a/io_uring/io_uring.h > >> +++ b/io_uring/io_uring.h > >> @@ -308,15 +308,17 @@ static inline int io_run_task_work(void) > >> */ > >> if (test_thread_flag(TIF_NOTIFY_SIGNAL)) > >> clear_notify_signal(); > >> + > >> + if (test_thread_flag(TIF_NOTIFY_RESUME)) { > >> + __set_current_state(TASK_RUNNING); > >> + resume_user_mode_work(NULL); > >> + } > >> + > >> /* > >> * PF_IO_WORKER never returns to userspace, so check here if we have > >> * notify work that needs processing. > >> */ > >> if (current->flags & PF_IO_WORKER) { > >> - if (test_thread_flag(TIF_NOTIFY_RESUME)) { > >> - __set_current_state(TASK_RUNNING); > >> - resume_user_mode_work(NULL); > >> - } > >> if (current->io_uring) { > >> unsigned int count = 0; > >> > >> > > > > Can confirm that also this patch fixes the issue on my end (both with the > > reordering of the task_work and without it). > > Great, thanks for testing! Sent out a v2. No need to test it unless you > absolutely want to ;-) > > > Also found a different way to trigger the issue that does not misuse > > IOSQE_IO_LINK. Do three sends with IOSQE_CQE_SKIP_SUCCESS | IOSQE_IO_LINK > > followed by a close with IOSQE_CQE_SKIP_SUCCESS on a ring with > > IORING_SETUP_DEFER_TASKRUN. > > > > I confirmed that that test case also first brakes on > > 846072f16eed3b3fb4e59b677f3ed8afb8509b89 and is fixed by either of the > > two patches you sent. > > > > Not sure if that's a preferable test case compared to the weirder ealier one. > > You can find it below as a patch to the existing test case in the liburing > > repo: > > I think that's an improvement, just because it doesn't rely on a weird > usage of IOSQE_IO_LINK. And it looks good to me - do you want me to > commit this directly, or do you want to send a "proper" patch (or github > PR) to retain the proper attribution to you? > Sent the PR with one minor change (adjusted the user data for the third send).