Patches 1 and 2 are simple and can be considered separately. Patches 3-8
are inline completion optimisations and should affect buffered rw,
recv/send and anything else that can complete inline.

fio/t/io_uring do_nop=1 benchmark (batch=32), in KIOPS:

baseline (1-5 applied):  qd32: 8001,   qd1: 2015
arrays (+6/8):           qd32: 8128,   qd1: 2028
batching (+7/8):         qd32: 10300,  qd1: 1946

The downside is slightly worse qd1 with batching. I don't think we
should care much, because at qd1 most of the time is spent syscalling,
and I can easily get ~15-30% and 5-10% for qd32 and qd1 respectively by
making the ring's allocation cache persistent and feeding the memory of
inline-executed requests back into it.

Note: this should not affect async-executed requests, e.g. block rw,
because they never hit this path.

Pavel Begunkov (8):
  io_uring: ensure only sqo_task has file notes
  io_uring: consolidate putting reqs task
  io_uring: don't keep submit_state on stack
  io_uring: remove ctx from comp_state
  io_uring: don't reinit submit state every time
  io_uring: replace list with array for compl batch
  io_uring: submit-completion free batching
  io_uring: keep interrupts on on submit completion

 fs/io_uring.c | 221 +++++++++++++++++++++++++-------------------------
 1 file changed, 110 insertions(+), 111 deletions(-)

-- 
2.24.0
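
For readers skimming the series, below is a rough, simplified sketch of
the idea behind patches 6/7 (array-based completion batch plus batched
freeing). This is not the fs/io_uring.c code; the struct layout, names
(struct submit_state, flush_completions(), COMPL_BATCH) and the plain
free() are illustrative stand-ins for the kernel's internals.

#include <stdlib.h>

#define COMPL_BATCH 32

struct req {
	long user_data;
	long result;
};

struct submit_state {
	/* fixed array instead of linking each request into a list */
	struct req *compl_reqs[COMPL_BATCH];
	unsigned int compl_nr;
	/* request memory collected here and handed back in bulk */
	void *free_reqs[COMPL_BATCH];
	unsigned int free_nr;
};

/* Post completions for the whole batch, then free request memory in bulk. */
static void flush_completions(struct submit_state *state)
{
	unsigned int i;

	/* in the real code the completion lock is taken once per batch here */
	for (i = 0; i < state->compl_nr; i++) {
		struct req *req = state->compl_reqs[i];

		/* filling a CQE for req would go here */
		state->free_reqs[state->free_nr++] = req;
	}
	state->compl_nr = 0;

	/* batched free: one pass instead of freeing every request separately */
	for (i = 0; i < state->free_nr; i++)
		free(state->free_reqs[i]);
	state->free_nr = 0;
}

/* Called when a request completes inline during submission. */
static void complete_inline(struct submit_state *state, struct req *req)
{
	state->compl_reqs[state->compl_nr++] = req;
	if (state->compl_nr == COMPL_BATCH)
		flush_completions(state);
}

The point of the array is that the batch has a known, small upper bound,
so the flush walks contiguous memory and amortises the locking and the
frees over up to COMPL_BATCH requests, which is where the qd32 gain in
the numbers above comes from.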