On 4/12/22 8:09 AM, Pavel Begunkov wrote:
> nops benchmark: 40.3 -> 41.1 MIOPS, or +2%
>
> Pavel Begunkov (9):
>   io_uring: explicitly keep a CQE in io_kiocb
>   io_uring: memcpy CQE from req
>   io_uring: shrink final link flush
>   io_uring: inline io_flush_cached_reqs
>   io_uring: helper for empty req cache checks
>   io_uring: add helper to return req to cache list
>   io_uring: optimise submission loop invariant
>   io_uring: optimise submission left counting
>   io_uring: optimise io_get_cqe()
>
>  fs/io_uring.c | 288 +++++++++++++++++++++++++++++---------------------
>  1 file changed, 165 insertions(+), 123 deletions(-)

Get about ~4% on aarch64. I like both main changes, memcpy of cqe and the
improvements to io_get_cqe().

-- 
Jens Axboe