Unfolding previous ideas on persistent req caches. Patches 4-7 inclusive
slashed ~20% of overhead for the nops benchmark. I haven't benchmarked the
whole series personally yet, but according to perf it should be ~30-40% in
total. That's for the IOPOLL + inline completion cases, obviously w/o
async/IRQ completions.

Jens,

1. 11/17 removes deallocations at the end of submit_sqes. Looks like you
   forgot or just didn't do that.

2. Lists are slow and not great cache-wise, that's why I want at least the
   combined approach from 12/17.

3. Instead of lists in "use persistent request cache" I had in mind a
   slightly different way: grow the req alloc cache to 32-128 entries (or
   take a hint from userspace), batch-alloc by 8 as before, and recycle
   _all_ reqs right back into it. If it overflows, do kfree(). That should
   give a probabilistically high hit rate, amortising out most allocations.
   Pros: it doesn't grow ~infinitely as lists can. Cons: there are always
   counter examples. But as I don't have numbers to back it up, I took your
   implementation. Maybe we'll reconsider later (see the sketch at the end
   of this mail).

I'll revise tomorrow on a fresh head + do some performance testing, and am
leaving it as an RFC until then.

Jens Axboe (3):
  io_uring: use persistent request cache
  io_uring: provide FIFO ordering for task_work
  io_uring: enable req cache for task_work items

Pavel Begunkov (14):
  io_uring: replace force_nonblock with flags
  io_uring: make op handlers always take issue flags
  io_uring: don't propagate io_comp_state
  io_uring: don't keep submit_state on stack
  io_uring: remove ctx from comp_state
  io_uring: don't reinit submit state every time
  io_uring: replace list with array for compl batch
  io_uring: submit-completion free batching
  io_uring: remove fallback_req
  io_uring: count ctx refs separately from reqs
  io_uring: persistent req cache
  io_uring: feed reqs back into alloc cache
  io_uring: take comp_state from ctx
  io_uring: defer flushing cached reqs

 fs/io-wq.h               |   9 -
 fs/io_uring.c            | 716 ++++++++++++++++++++++-----------------
 include/linux/io_uring.h |  14 +
 3 files changed, 425 insertions(+), 314 deletions(-)

--
2.24.0
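
P.S. To illustrate what I mean in point 3, here is a minimal userspace
sketch of the bounded alloc-cache recycling. It is not code from the
series: the names and sizes are made up, and malloc/free stand in for the
slab allocator.

#include <stdlib.h>

#define REQ_CACHE_MAX	64	/* would be 32-128, or a hint from userspace */
#define REQ_ALLOC_BATCH	8	/* batch-alloc by 8 as before */

struct req { int dummy; /* request fields */ };

struct req_cache {
	struct req *reqs[REQ_CACHE_MAX];
	unsigned nr;
};

/* Take a req from the cache, refilling it in batches of 8 on a miss. */
static struct req *req_alloc(struct req_cache *c)
{
	if (!c->nr) {
		unsigned i;

		for (i = 0; i < REQ_ALLOC_BATCH; i++) {
			struct req *r = malloc(sizeof(*r));

			if (!r)
				break;
			c->reqs[c->nr++] = r;
		}
		if (!c->nr)
			return NULL;
	}
	return c->reqs[--c->nr];
}

/* Recycle every completed req into the cache; free only on overflow. */
static void req_free(struct req_cache *c, struct req *r)
{
	if (c->nr < REQ_CACHE_MAX)
		c->reqs[c->nr++] = r;
	else
		free(r);
}

int main(void)
{
	struct req_cache cache = { .nr = 0 };
	struct req *r = req_alloc(&cache);

	if (r)
		req_free(&cache, r); /* goes back into the cache, not the allocator */
	return 0;
}

The point of the fixed-size array is that it stays bounded instead of
growing ~infinitely as a list can, while recycling every freed req keeps
the hit rate high in steady state.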