Currently we drop completion events if the CQ ring is full. That's fine
for requests with bounded completion times, but it may make it harder to
use io_uring with networked IO where request completion times are
generally unbounded. Or with POLL, for example, which is also unbounded.

This patch adds IORING_SETUP_CQ_NODROP, which changes the behavior a bit
for CQ ring overflows. First of all, it doesn't overflow the ring, it
simply stores a backlog of completions that we weren't able to put into
the CQ ring. To prevent the backlog from growing indefinitely, if the
backlog is non-empty, we apply back pressure on IO submissions. Any
attempt to submit new IO with a non-empty backlog will get an -EBUSY
return from the kernel.

I think that makes for a pretty sane API in terms of how the application
can handle it. With CQ_NODROP enabled, we'll never drop a completion
event, but we'll also not allow submissions with a completion backlog.

Changes since v1:

- Drop the cqe_drop structure and allocation, simply use the io_kiocb
  for the overflow backlog

- Rebase on top of Pavel's series which made this cleaner

- Add prep patch for the fill/add CQ handler changes

 fs/io_uring.c                 | 203 +++++++++++++++++++++++----------
 include/uapi/linux/io_uring.h |   1 +
 2 files changed, 138 insertions(+), 66 deletions(-)

-- 
Jens Axboe
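
For illustration only, here is a toy user-space model of the CQ_NODROP semantics described above. It is not the kernel code from this series: every toy_* name, the tiny ring size, and the in-flight cap are hypothetical, and flushing the backlog whenever a completion is reaped is an assumption of the sketch. It only shows the two rules: a completion that doesn't fit is parked rather than dropped, and submission gets -EBUSY while anything is parked.

```c
#include <assert.h>
#include <errno.h>

#define TOY_CQ_ENTRIES  2   /* hypothetical ring size, tiny on purpose */
#define TOY_BACKLOG_MAX 16  /* hypothetical cap, toy model only */

struct toy_ring {
    int cq[TOY_CQ_ENTRIES];        /* completion results in the ring */
    unsigned head, tail;           /* free-running ring indices */
    int backlog[TOY_BACKLOG_MAX];  /* completions that did not fit */
    unsigned backlog_len;
    unsigned inflight;             /* submitted, not yet completed */
};

static unsigned toy_cq_used(const struct toy_ring *r)
{
    return r->tail - r->head;
}

/* Submit one request; refuse with -EBUSY while a backlog exists. */
static int toy_submit(struct toy_ring *r)
{
    if (r->backlog_len > 0)
        return -EBUSY;
    if (r->inflight == TOY_BACKLOG_MAX)
        return -EAGAIN;  /* toy-only limit so the backlog array cannot overflow */
    r->inflight++;
    return 0;
}

/* Complete one in-flight request. Never drops: if the ring is full (or
 * older entries are already parked), the result goes to the backlog. */
static void toy_complete(struct toy_ring *r, int res)
{
    assert(r->inflight > 0);
    r->inflight--;
    if (r->backlog_len == 0 && toy_cq_used(r) < TOY_CQ_ENTRIES) {
        r->cq[r->tail++ % TOY_CQ_ENTRIES] = res;
    } else {
        assert(r->backlog_len < TOY_BACKLOG_MAX);
        r->backlog[r->backlog_len++] = res;
    }
}

/* Reap one completion into *res; returns 1 if one was available, else 0.
 * Afterwards, move backlog entries into any ring slots that freed up. */
static int toy_reap(struct toy_ring *r, int *res)
{
    unsigned moved = 0, i;

    if (toy_cq_used(r) == 0)
        return 0;
    *res = r->cq[r->head++ % TOY_CQ_ENTRIES];

    while (moved < r->backlog_len && toy_cq_used(r) < TOY_CQ_ENTRIES)
        r->cq[r->tail++ % TOY_CQ_ENTRIES] = r->backlog[moved++];
    for (i = moved; i < r->backlog_len; i++)
        r->backlog[i - moved] = r->backlog[i];
    r->backlog_len -= moved;
    return 1;
}

/* Application-side pattern: on -EBUSY, reap completions and retry.
 * Returns the final submit status; *reaped counts completions consumed. */
static int toy_submit_retry(struct toy_ring *r, unsigned *reaped)
{
    int ret, res;

    while ((ret = toy_submit(r)) == -EBUSY) {
        if (!toy_reap(r, &res))
            break;  /* not expected here: a backlog implies a full ring */
        (*reaped)++;
    }
    return ret;
}
```

The application-side loop is the part that matters: -EBUSY is treated as "go reap completions first", not as a hard failure. A real application would consume each reaped result instead of just counting it.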