tested with fio/t/io_uring nops all batching=32: 24 vs 31.5 MIOPS, or ~30% win

WARNING: there is one problem with draining, will fix in v2

There are two parts:

1-14 are about optimising the completion path:
- replaces lists with single linked lists
- kills 64 * 8B of caches in ctx
- adds some shuffling of iopoll bits
- list splice instead of per-req list_add in one place
- inlines io_req_free_batch() and other helpers

15-22: inlines __io_queue_sqe() so all the submission path up to
io_issue_sqe() is inlined + little tweaks

Pavel Begunkov (23):
  io_uring: mark having different creds unlikely
  io_uring: force_nonspin
  io_uring: make io_do_iopoll return number of reqs
  io_uring: use slist for completion batching
  io_uring: remove allocation cache array
  io-wq: add io_wq_work_node based stack
  io_uring: replace list with stack for req caches
  io_uring: split iopoll loop
  io_uring: use single linked list for iopoll
  io_uring: add a helper for batch free
  io_uring: convert iopoll_completed to store_release
  io_uring: optimise batch completion
  io_uring: inline completion batching helpers
  io_uring: don't pass tail into io_free_batch_list
  io_uring: don't pass state to io_submit_state_end
  io_uring: deduplicate io_queue_sqe() call sites
  io_uring: remove drain_active check from hot path
  io_uring: split slow path from io_queue_sqe
  io_uring: inline hot path of __io_queue_sqe()
  io_uring: reshuffle queue_sqe completion handling
  io_uring: restructure submit sqes to_submit checks
  io_uring: kill off ->inflight_entry field
  io_uring: comment why inline complete calls io_clean_op()

 fs/io-wq.h    |  60 +++++-
 fs/io_uring.c | 503 +++++++++++++++++++++++---------------------------
 2 files changed, 283 insertions(+), 280 deletions(-)

-- 
2.33.0