On 9/23/22 8:43 AM, Pavel Begunkov wrote:
> On 9/23/22 15:35, Jens Axboe wrote:
>> On 9/23/22 8:26 AM, Pavel Begunkov wrote:
>>> On 9/23/22 15:19, Jens Axboe wrote:
>>>> On 9/23/22 7:53 AM, Pavel Begunkov wrote:
>>>>> Overflowing CQEs may result in reordering, which is buggy in case of
>>>>> links, F_MORE and so.
>>>>>
>>>>> Reported-by: Dylan Yudaken <dylany@xxxxxx>
>>>>> Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
>>>>> ---
>>>>>  io_uring/io_uring.c | 12 ++++++++++--
>>>>>  io_uring/io_uring.h | 12 +++++++++---
>>>>>  2 files changed, 19 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>>>> index f359e24b46c3..62d1f55fde55 100644
>>>>> --- a/io_uring/io_uring.c
>>>>> +++ b/io_uring/io_uring.c
>>>>> @@ -609,7 +609,7 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
>>>>>  	io_cq_lock(ctx);
>>>>>  	while (!list_empty(&ctx->cq_overflow_list)) {
>>>>> -		struct io_uring_cqe *cqe = io_get_cqe(ctx);
>>>>> +		struct io_uring_cqe *cqe = io_get_cqe_overflow(ctx, true);
>>>>>  		struct io_overflow_cqe *ocqe;
>>>>>  		if (!cqe && !force)
>>>>> @@ -736,12 +736,19 @@ bool io_req_cqe_overflow(struct io_kiocb *req)
>>>>>   * control dependency is enough as we're using WRITE_ONCE to
>>>>>   * fill the cq entry
>>>>>   */
>>>>> -struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx)
>>>>> +struct io_uring_cqe *__io_get_cqe(struct io_ring_ctx *ctx, bool overflow)
>>>>>  {
>>>>>  	struct io_rings *rings = ctx->rings;
>>>>>  	unsigned int off = ctx->cached_cq_tail & (ctx->cq_entries - 1);
>>>>>  	unsigned int free, queued, len;
>>>>> +	/*
>>>>> +	 * Posting into the CQ when there are pending overflowed CQEs may break
>>>>> +	 * ordering guarantees, which will affect links, F_MORE users and more.
>>>>> +	 * Force overflow the completion.
>>>>> +	 */
>>>>> +	if (!overflow && (ctx->check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT)))
>>>>> +		return NULL;
>>>>
>>>> Rather than pass this bool around for the hot path, why not add a helper
>>>> for the case where 'overflow' isn't known? That can leave the regular
>>>> io_get_cqe() avoiding this altogether.
>>>
>>> Was choosing from two ugly-ish solutions, but io_get_cqe() should be
>>> inline and shouldn't really matter, but that's only the case in theory
>>> though. If someone cleans up the CQE32 part and puts it into a separate
>>> non-inline function, it'll be actually inlined.
>>
>> Yes, in theory the current one will be fine as it's known at compile
>> time. In theory... Didn't check if practice agrees with that, would
>> prefer if we didn't leave this to the compiler. Fiddling some other
>> bits, will check in a bit if I have a better idea.
>
> When inline constants are propagated to the moment they're needed,
> no sane compiler will do otherwise, that's one of the most basic
> optimisations. Don't think it's sane not relying on that.

Yeah, it's probably fine as-is, I'd expect it to do that as well for sure.

-- 
Jens Axboe
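
For reference, the constant-propagation point discussed above can be
illustrated with a small user-space sketch. This is not the kernel code:
struct ring_model, model_get_slot() and model_get_slot_overflow() are
hypothetical names, and the overflow_pending field only stands in for the
IO_CHECK_CQ_OVERFLOW_BIT test in the patch. The idea, assuming the wrapper
is inlined, is that the regular caller passes a literal false, so the
compiler can fold the !overflow test at compile time, while the flush path
passes true and skips the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a completion ring; not kernel structures. */
struct ring_model {
	unsigned int head;      /* consumer index */
	unsigned int tail;      /* producer index */
	unsigned int entries;   /* ring size, must be a power of two (<= 4 here) */
	bool overflow_pending;  /* stands in for IO_CHECK_CQ_OVERFLOW_BIT */
	int slots[4];
};

/*
 * Returns the next free slot, or NULL if the caller must take the overflow
 * path. When 'overflow' is false and overflowed entries are pending, refuse
 * to post directly so that newer completions cannot overtake older ones.
 */
static inline int *model_get_slot_overflow(struct ring_model *r, bool overflow)
{
	if (!overflow && r->overflow_pending)
		return NULL;
	if (r->tail - r->head == r->entries)
		return NULL; /* ring is full */
	return &r->slots[r->tail++ & (r->entries - 1)];
}

/* Regular path: 'overflow' is a compile-time constant after inlining. */
static inline int *model_get_slot(struct ring_model *r)
{
	return model_get_slot_overflow(r, false);
}

/* Small usage sketch exercising both paths. */
void model_selfcheck(void)
{
	struct ring_model r = { .entries = 4 };

	assert(model_get_slot(&r) != NULL);                /* nothing pending */
	r.overflow_pending = true;
	assert(model_get_slot(&r) == NULL);                /* forced to overflow */
	assert(model_get_slot_overflow(&r, true) != NULL); /* flush may post */
}
```

Whether a given compiler actually folds the branch is something to confirm
by looking at the generated code; the sketch only shows why the constant is
visible at the call site.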