On Mon, Apr 25, 2022 at 11:25:26AM -0700, Stefan Roesch wrote:
This adds the overflow processing for large CQE's.
This adds two parameters to the io_cqring_event_overflow function and
uses these fields to initialize the large CQE fields.
Allocate enough space for large CQE's in the overflow structue. If no
large CQE's are used, the size of the allocation is unchanged.
The cqe field can have a different size depending if its a large
CQE or not. To be able to allocate different sizes, the two fields
in the structure are re-ordered.
Co-developed-by: Jens Axboe <axboe@xxxxxxxxx>
Signed-off-by: Stefan Roesch <shr@xxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
---
fs/io_uring.c | 31 ++++++++++++++++++++++---------
1 file changed, 22 insertions(+), 9 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 68b61d2b356d..3630671325ea 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -220,8 +220,8 @@ struct io_mapped_ubuf {
struct io_ring_ctx;
struct io_overflow_cqe {
- struct io_uring_cqe cqe;
struct list_head list;
+ struct io_uring_cqe cqe;
};
struct io_fixed_file {
@@ -2017,10 +2017,14 @@ static void io_cqring_ev_posted_iopoll(struct io_ring_ctx *ctx)
static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
{
bool all_flushed, posted;
+ size_t cqe_size = sizeof(struct io_uring_cqe);
if (!force && __io_cqring_events(ctx) == ctx->cq_entries)
return false;
+ if (ctx->flags & IORING_SETUP_CQE32)
+ cqe_size <<= 1;
+
posted = false;
spin_lock(&ctx->completion_lock);
while (!list_empty(&ctx->cq_overflow_list)) {
@@ -2032,7 +2036,7 @@ static bool __io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
ocqe = list_first_entry(&ctx->cq_overflow_list,
struct io_overflow_cqe, list);
if (cqe)
- memcpy(cqe, &ocqe->cqe, sizeof(*cqe));
+ memcpy(cqe, &ocqe->cqe, cqe_size);
Maybe a nit, but if we do it this way -
memcpy(cqe, &ocqe->cqe,
sizeof(*cqe) << (ctx->flags & IORING_SETUP_CQE32));
we can do away with all previous changes in this function.