On 9/23/22 15:19, Jens Axboe wrote:
On 9/23/22 7:53 AM, Pavel Begunkov wrote:
Overflowing CQEs may result in reordering, which is buggy in case of
links, F_MORE and so on.
Reported-by: Dylan Yudaken <dylany@xxxxxx>
Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
---
io_uring/io_uring.c | 12 ++++++++++--
io_uring/io_uring.h | 12 +++++++++---
2 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index f359e24b46c3..62d1f55fde55 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -609,7 +609,7 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
io_cq_lock(ctx);
while (!list_empty(&ctx->cq_overflow_list)) {
- struct io_uring_cqe *cqe = io_get_cqe(ctx);
+ struct io_uring_cqe *cqe = io_get_cqe_overflow(ctx, true);
struct io_overflow_cqe *ocqe;
if (!cqe && !force)
@@ -736,12 +736,19 @@ bool io_req_cqe_overflow(struct io_kiocb *req)
* control dependency is enough as we're using WRITE_ONCE to
* fill the cq entry
*/
-struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx)
+struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx, bool overflow)
{
struct io_rings *rings = ctx->rings;
unsigned int off = ctx->cached_cq_tail & (ctx->cq_entries - 1);
unsigned int free, queued, len;
+ /*
+ * Posting into the CQ when there are pending overflowed CQEs may break
+ * ordering guarantees, which will affect links, F_MORE users and more.
+ * Force overflow the completion.
+ */
+ if (!overflow && (ctx->check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT)))
+ return NULL;
Rather than pass this bool around on the hot path, why not add a helper
for the case where 'overflow' isn't known? That would let the regular
io_get_cqe() avoid this altogether.
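For illustration, a minimal sketch of what such a pair of helpers could
look like; the io_uring.h hunk isn't quoted above and the cached-CQE fast
path is left out, so the bodies below are assumptions rather than the
actual patch:

static inline struct io_uring_cqe *io_get_cqe_overflow(struct io_ring_ctx *ctx,
							bool overflow)
{
	/*
	 * Entry point for callers that know whether they are draining the
	 * overflow list; the flush path passes true so the pending-overflow
	 * check in __io_get_cqe() is skipped.
	 */
	return __io_get_cqe(ctx, overflow);
}

static inline struct io_uring_cqe *io_get_cqe(struct io_ring_ctx *ctx)
{
	/*
	 * Regular completion path: always keep the pending-overflow check.
	 * The constant false is only free if __io_get_cqe() really gets
	 * inlined.
	 */
	return io_get_cqe_overflow(ctx, false);
}

The overflow-flush hunk quoted above shows the overflow == true side of
this.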
I was choosing between two ugly-ish solutions. io_get_cqe() should be
inlined, so the extra argument shouldn't really matter, but that's only
true in theory. If someone cleans up the CQE32 part and moves it into a
separate non-inline function, it'll actually get inlined.
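For illustration only, one possible shape of that kind of cleanup; the
helper name, the inline placement and both bodies are guesses, and the
cached-CQE batching and the ring-full accounting are omitted. The point is
just that a small inlined body lets a compile-time-constant 'overflow'
fold away:

/* big CQEs occupy two slots in cqes[], so keep that handling out of line */
static noinline struct io_uring_cqe *io_get_cqe32(struct io_ring_ctx *ctx)
{
	unsigned int off = (ctx->cached_cq_tail++ & (ctx->cq_entries - 1)) << 1;

	return &ctx->rings->cqes[off];
}

/*
 * Small enough to actually be inlined: with overflow == false from
 * io_get_cqe() only the check_cq test survives, and with overflow == true
 * from the flush path the branch disappears entirely.
 */
static inline struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx,
						bool overflow)
{
	if (!overflow && (ctx->check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT)))
		return NULL;
	if (ctx->flags & IORING_SETUP_CQE32)
		return io_get_cqe32(ctx);
	return &ctx->rings->cqes[ctx->cached_cq_tail++ & (ctx->cq_entries - 1)];
}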
--
Pavel Begunkov