On 9/24/21 2:59 PM, Pavel Begunkov wrote:
> 24 MIOPS vs 31.5, or ~30% win for fio/t/io_uring nops batching=32
> Jens mentioned that with his standard test against Optane it gave
> yet another +3% to throughput.
>
> 1-14 are about optimising the completion path:
> - replaces lists with single linked lists
> - kills 64 * 8B of caches in ctx
> - adds some shuffling of iopoll bits
> - list splice instead of per-req list_add in one place
> - inlines io_req_free_batch() and other helpers
>
> 15-22: inlines __io_queue_sqe() so all the submission path
> up to io_issue_sqe() is inlined + little tweaks

Applied for 5.16, thanks!

-- 
Jens Axboe