On 10/12/21 5:12 AM, Christoph Hellwig wrote:
> Hi all,
>
> This series cleans up the block polling code a bit and changes the interface
> to poll for a specific bio instead of a request_queue and cookie pair.
>
> Polling for the bio itself leads to a few advantages:
>
>  - the cookie construction can be made entirely private in blk-mq.c
>  - the caller does not need to remember the request_queue and cookie
>    separately and thus sidesteps their lifetime issues
>  - keeping the device and the cookie inside the bio makes it trivial to
>    support polling of BIOs remapped by stacking drivers
>  - a lot of code to propagate the cookie back up the submission path can
>    be removed entirely
>
> The one major caveat is that this requires RCU freeing of polled BIOs to
> make sure the bio that contains the polling information is still alive when
> io_uring tries to poll it through the iocb. For synchronous polling all the
> callers have a bio reference anyway, so this is not an issue.

I ran this through the usual peak testing, and it doesn't seem to
regress anything for me. We're still at around 7.4M polled IOPS on a
single CPU core:

taskset -c 0,16 t/io_uring -d128 -b512 -s32 -c32 -p1 -F1 -B1 -D1 -n2 /dev/nvme1n1 /dev/nvme2n1
Added file /dev/nvme1n1 (submitter 0)
Added file /dev/nvme2n1 (submitter 1)
polled=1, fixedbufs=1, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=256
submitter=0, tid=1199
submitter=1, tid=1200
IOPS=7322112, BW=3575MiB/s, IOS/call=32/31, inflight=(110 71)
IOPS=7452736, BW=3639MiB/s, IOS/call=32/31, inflight=(52 80)
IOPS=7419904, BW=3623MiB/s, IOS/call=32/31, inflight=(78 104)
IOPS=7392576, BW=3609MiB/s, IOS/call=32/32, inflight=(75 102)

with some of my pending changes and hacks. Using IRQ mode, we're at
around 4.9M IOPS, and I don't see any particular impact of needing
deferred RCU free of the bio for that case.

-- 
Jens Axboe
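
As a sanity check on the figures quoted above, the BW column should follow from the IOPS column and the 512-byte block size given by -b512. The sketch below is not part of the original test run. The helper names are made up for illustration, and truncating to whole MiB/s is an assumption, chosen because it reproduces all four reported samples.

```python
# Samples copied from the t/io_uring output above: (IOPS, reported BW in MiB/s).
SAMPLES = [
    (7322112, 3575),
    (7452736, 3639),
    (7419904, 3623),
    (7392576, 3609),
]

BLOCK_SIZE = 512      # bytes per I/O, from the -b512 argument
MIB = 1024 * 1024     # bytes per MiB


def bw_mib_per_s(iops: int, block_size: int = BLOCK_SIZE) -> int:
    """Bandwidth in whole MiB/s; truncation is an assumed rounding mode."""
    return iops * block_size // MIB


def mean_iops(samples) -> int:
    """Integer mean of the IOPS column across the given samples."""
    return sum(iops for iops, _ in samples) // len(samples)


if __name__ == "__main__":
    for iops, reported in SAMPLES:
        print(f"IOPS={iops} computed={bw_mib_per_s(iops)}MiB/s "
              f"reported={reported}MiB/s")
    print(f"mean IOPS={mean_iops(SAMPLES)}")
```

The mean of the four samples comes out just under 7.4M, consistent with the "around 7.4M" figure in the text.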