On Sun, 2023-01-29 at 16:17 -0700, Jens Axboe wrote:
> On 1/29/23 3:57 PM, Jens Axboe wrote:
> > On 1/27/23 6:52 AM, Dylan Yudaken wrote:
> > > REQ_F_FORCE_ASYNC was being ignored for re-queueing linked
> > > requests. Instead obey that flag.
> > > 
> > > Signed-off-by: Dylan Yudaken <dylany@xxxxxxxx>
> > > ---
> > >  io_uring/io_uring.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> > > index db623b3185c8..980ba4fda101 100644
> > > --- a/io_uring/io_uring.c
> > > +++ b/io_uring/io_uring.c
> > > @@ -1365,10 +1365,12 @@ void io_req_task_submit(struct io_kiocb *req, bool *locked)
> > >  {
> > >  	io_tw_lock(req->ctx, locked);
> > >  	/* req->task == current here, checking PF_EXITING is safe */
> > > -	if (likely(!(req->task->flags & PF_EXITING)))
> > > -		io_queue_sqe(req);
> > > -	else
> > > +	if (unlikely(req->task->flags & PF_EXITING))
> > >  		io_req_defer_failed(req, -EFAULT);
> > > +	else if (req->flags & REQ_F_FORCE_ASYNC)
> > > +		io_queue_iowq(req, locked);
> > > +	else
> > > +		io_queue_sqe(req);
> > >  }
> > > 
> > >  void io_req_task_queue_fail(struct io_kiocb *req, int ret)
> > 
> > This one causes a failure for me with test/multicqes_drain.t, which
> > doesn't quite make sense to me (just yet), but it is a reliable
> > timeout.
> 
> OK, quick look and I think this is a bad assumption in the test case.
> It's assuming that a POLL_ADD already succeeded, and hence that a
> subsequent POLL_REMOVE will succeed. But now it's getting ENOENT as
> we can't find it just yet, which means the cancelation itself isn't
> being done. So we just end up waiting for something that doesn't
> happen.
> 
> Or could be an internal race with lookup/issue. In any case, it's
> definitely being exposed by this patch.

That is a bit of an unpleasant test. Essentially it triggers a pipe and
then immediately reads from it. The test expects to see a CQE for that
trigger, but if anything runs asynchronously there is a race between
the read and the poll logic running.

The attached patch fixes the test, but the reason my patches trigger it
is a bit weird. It occurs on the second loop of the test, after the
initial drain: ctx->drain_active is still true when the second set of
polls is added, since drain_active is only cleared inside the next
io_drain_req. So the first poll has REQ_F_FORCE_ASYNC set. Previously
those FORCE_ASYNC flags were ignored, but now, with "io_uring: if a
linked request has REQ_F_FORCE_ASYNC then run it async", they send the
request to the work thread, which causes the race.

I wonder if drain_active should actually be cleared earlier? Perhaps
before setting the REQ_F_FORCE_ASYNC flag? The drain logic is pretty
complex though, so I am not terribly keen to start changing it if the
change is not generally useful.
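
For reference, the bookkeeping I am describing is roughly this
(paraphrased from io_uring.c, so the exact shape may not match your
tree):

	/* io_init_req(): while a drain is pending, every new request
	 * is knocked onto the slow queue path */
	if (ctx->drain_active)
		req->flags |= REQ_F_FORCE_ASYNC;

	/* io_drain_req(): drain_active is only dropped here, i.e. once
	 * a later request reaches the drain path and finds the defer
	 * list empty. So the polls added right after the first drain
	 * completes still carry REQ_F_FORCE_ASYNC. */
	if (!req_need_defer(req, seq) && list_empty(&ctx->defer_list)) {
		spin_unlock(&ctx->completion_lock);
		ctx->drain_active = false;
		io_req_task_queue(req);
		return;
	}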
commit d362fb231310a52a79c8b9f72165a708bfd8aa44
Author: Dylan Yudaken <dylany@xxxxxxxx>
Date:   Mon Jan 30 01:49:57 2023 -0800

    multicqes_drain: make trigger event wait before reading

    trigger_event is used to generate CQEs on the poll requests. However
    there is a race if that poll request is running asynchronously, where
    the read_pipe will complete before the poll is run, and the poll
    result will be that there is no data ready.

    Instead sleep and force an io_uring_get_events in order to give the
    poll a chance to run before reading from the pipe.

    Signed-off-by: Dylan Yudaken <dylany@xxxxxxxx>

diff --git a/test/multicqes_drain.c b/test/multicqes_drain.c
index 3755beec42c7..6c4d5f2ba887 100644
--- a/test/multicqes_drain.c
+++ b/test/multicqes_drain.c
@@ -71,13 +71,15 @@ static void read_pipe(int pipe)
 		perror("read");
 }
 
-static int trigger_event(int p[])
+static int trigger_event(struct io_uring *ring, int p[])
 {
 	int ret;
 	if ((ret = write_pipe(p[1], "foo")) != 3) {
 		fprintf(stderr, "bad write return %d\n", ret);
 		return 1;
 	}
+	usleep(1000);
+	io_uring_get_events(ring);
 	read_pipe(p[0]);
 	return 0;
 }
@@ -236,10 +238,8 @@ static int test_generic_drain(struct io_uring *ring)
 		if (si[i].op != multi && si[i].op != single)
 			continue;
 
-		if (trigger_event(pipes[i]))
+		if (trigger_event(ring, pipes[i]))
 			goto err;
-
-		io_uring_get_events(ring);
 	}
 	sleep(1);
 	i = 0;
@@ -317,13 +317,11 @@ static int test_simple_drain(struct io_uring *ring)
 	}
 
 	for (i = 0; i < 2; i++) {
-		if (trigger_event(pipe1))
+		if (trigger_event(ring, pipe1))
 			goto err;
-		io_uring_get_events(ring);
 	}
 
-	if (trigger_event(pipe2))
-		goto err;
-	io_uring_get_events(ring);
+	if (trigger_event(ring, pipe2))
+		goto err;
 
 	for (i = 0; i < 2; i++) {
 		sqe[i] = io_uring_get_sqe(ring);
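
In case it is useful for review, here is a standalone sketch of the
pattern the test now follows: make the fd readable, give any async poll
machinery a chance to run via io_uring_get_events(), and only then
drain the fd. This is simplified (error handling omitted) and not taken
from the test itself:

	#include <poll.h>
	#include <stdio.h>
	#include <unistd.h>
	#include <liburing.h>

	int main(void)
	{
		struct io_uring ring;
		struct io_uring_sqe *sqe;
		struct io_uring_cqe *cqe;
		char buf[4] = {0};
		int p[2];

		io_uring_queue_init(8, &ring, 0);
		pipe(p);

		/* arm a poll on the read end of the pipe */
		sqe = io_uring_get_sqe(&ring);
		io_uring_prep_poll_add(sqe, p[0], POLLIN);
		io_uring_submit(&ring);

		/* make the pipe readable, then let any async poll work
		 * run before draining the pipe again */
		write(p[1], "foo", 3);
		usleep(1000);
		io_uring_get_events(&ring);
		read(p[0], buf, 3);

		/* the poll CQE arrives even if the poll ran async */
		if (!io_uring_wait_cqe(&ring, &cqe)) {
			printf("poll res %d\n", cqe->res);
			io_uring_cqe_seen(&ring, cqe);
		}

		io_uring_queue_exit(&ring);
		return 0;
	}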