On 12/4/21 23:20, Jens Axboe wrote:
On 12/4/21 3:48 PM, Pavel Begunkov wrote:
On 12/4/21 22:21, Jens Axboe wrote:
On 12/4/21 1:49 PM, Pavel Begunkov wrote:
Currently, IOPOLL returns the number of completed requests, but with
REQ_F_CQE_SKIP there are not the same thing anymore. That may be
confusing as non-iopoll wait cares only about CQEs, so make io_do_iopoll
return the number of posted CQEs.
Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
---
fs/io_uring.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 64add8260abb..ea7a0daa0b3b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2538,10 +2538,10 @@ static int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
/* order with io_complete_rw_iopoll(), e.g. ->result updates */
if (!smp_load_acquire(&req->iopoll_completed))
break;
+ if (unlikely(req->flags & REQ_F_CQE_SKIP))
+ continue;
- if (!(req->flags & REQ_F_CQE_SKIP))
- __io_fill_cqe(ctx, req->user_data, req->result,
- io_put_kbuf(req));
+ __io_fill_cqe(ctx, req->user_data, req->result, io_put_kbuf(req));
nr_events++;
}
Not sure I follow the logic behind this change. Places like
io_iopoll_try_reap_events() just need a "did we find anything" return,
which is independent on whether or not we actually posted CQEs or not.
Other callers either don't care what the return value is or if it's < 0
or not (which this change won't affect).
I feel like I'm missing something here, or that the commit message
better needs to explain why this change is done.
I was wrong on how I described it, but it means that the problem is in
a different place.
int io_do_iopoll() {
return nr_events;
}
int io_iopoll_check() {
do {
nr_events += io_do_iopoll();
while (nr_events < min && ...);
}
And "events" there better to be CQEs, otherwise the semantics
of @min + CQE_SKIP is not very clear and mismatches non-IOPOLL.
Can you do a v2 of this patch? Rest of them look good to me.
I'll send it tomorrow with clarifications added
--
Pavel Begunkov