Hi all,
This series cleans up the block polling code a bit and changes the interface
to poll for a specific bio instead of a request_queue and cookie pair.
Polling for the bio itself leads to a few advantages:
- the cookie construction can be made entirely private in blk-mq.c
- the caller does not need to remember the request_queue and cookie
separately and thus sidesteps their lifetime issues
- keeping the device and the cookie inside the bio makes it trivial to
support polling of BIOs remapped by stacking drivers
- a lot of code to propagate the cookie back up the submission path can
be removed entirely
The one major caveat is that this requires RCU freeing of polled BIOs to make
sure the bio that contains the polling information is still alive when
io_uring tries to poll it through the iocb. For synchronous polling all the
callers have a bio reference anyway, so this is not an issue.
I ran this through the usual peak testing, and it doesn't seem to regress
anything for me. We're still at around 7.4M polled IOPS on a single CPU
core:
taskset -c 0,16 t/io_uring -d128 -b512 -s32 -c32 -p1 -F1 -B1 -D1 -n2 /dev/nvme1n1 /dev/nvme2n1
Added file /dev/nvme1n1 (submitter 0)
Added file /dev/nvme2n1 (submitter 1)
polled=1, fixedbufs=1, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=256
submitter=0, tid=1199
submitter=1, tid=1200
IOPS=7322112, BW=3575MiB/s, IOS/call=32/31, inflight=(110 71)
IOPS=7452736, BW=3639MiB/s, IOS/call=32/31, inflight=(52 80)
IOPS=7419904, BW=3623MiB/s, IOS/call=32/31, inflight=(78 104)
IOPS=7392576, BW=3609MiB/s, IOS/call=32/32, inflight=(75 102)
Jens, is that with nvme_core.multipath=Y ?