On Mon, 05 Dec 2022 02:44:24 +0000, Pavel Begunkov wrote:
> Optimise CQ locking for event posting depending on a number of ring setup flags.
> QD1 nop benchmark showed 12.067 -> 12.565 MIOPS increase, which is more than 8.5%
> of the io_uring kernel overhead (taking into account that the syscall overhead
> is over 50%) or 4.12% of the total performance. Naturally, it's not only about
> QD1, applications can submit a bunch of requests but their completions may
> arrive randomly, hurting batching and so performance (or latency).
>
> [...]

Applied, thanks!

[1/7] io_uring: skip overflow CQE posting for dying ring
      commit: 3dac93b1fae0b90211ed50fac8c2b48df1fc01dc
[2/7] io_uring: don't check overflow flush failures
      commit: a3f63209455a1d453ee8d9b87d0e07971b3c356e
[3/7] io_uring: complete all requests in task context
      commit: ab857514be26e0050e29696f363a96d238d8817e
[4/7] io_uring: force multishot CQEs into task context
      commit: 6db5fe86590f68c69747e8d5a3190b710e36ffb2
[5/7] io_uring: post msg_ring CQE in task context
      commit: d9143438fdccc62eb31a0985caa00c2876f8aa75
[6/7] io_uring: use tw for putting rsrc
      commit: 3a65f4413a2ccd362227c7d121ef549aa5a92b46
[7/7] io_uring: skip spinlocking for ->task_complete
      commit: 65a52cc3de9d7a93aa4c52a4a03e4a91ad7d1943

Best regards,
-- 
Jens Axboe