On 3/28/2024 5:08 AM, Jens Axboe wrote:
> This patchset should look cleaner if you rebase it on top of the current
> for-6.10/io_uring branch, as it gets rid of the async nastiness. Since
> that'll need doing anyway, could you repost a v2 where it's rebased on
> top of that?

Yes, next iteration will use that as the base.

> Also in terms of the cover letter, would be good with a bit more of a
> description of what this enables. It's a bit scant on detail on what
> exactly this gives you.

Will fix that.
But currently the only thing it gives is the ability to pass a meta
buffer to/from the block device. It keeps things simple, and is fine for
PI type 0 (normal unprotected IO).
For other PI types, exposing a few knobs may help, using "sqe->rw_flags"
if there is no other way.

>> taskset -c 2,5 t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -n2 -r4 /dev/nvme0n1 /dev/nvme1n1
>> submitter=1, tid=2453, file=/dev/nvme1n1, node=-1
>> submitter=0, tid=2452, file=/dev/nvme0n1, node=-1
>> polled=1, fixedbufs=1, register_files=1, buffered=0, QD=128
>> Engine=io_uring, sq_ring=128, cq_ring=128
>> IOPS=10.02M, BW=4.89GiB/s, IOS/call=31/31
>> IOPS=10.04M, BW=4.90GiB/s, IOS/call=31/31
>>
>> With this:
>> taskset -c 2,5 t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -n2 -r4 /dev/nvme0n1 /dev/nvme1n1
>> submitter=1, tid=2453, file=/dev/nvme1n1, node=-1
>> submitter=0, tid=2452, file=/dev/nvme0n1, node=-1
>> polled=1, fixedbufs=1, register_files=1, buffered=0, QD=128
>> Engine=io_uring, sq_ring=128, cq_ring=128
>> IOPS=10.02M, BW=4.89GiB/s, IOS/call=31/31
>> IOPS=10.04M, BW=4.90GiB/s, IOS/call=31/31
>
> Not that I don't believe you, but that looks like you pasted the same
> stuff in there twice? It's the exact same perf and pids.

Indeed :-(
Made a goof-up while pasting stuff [1] to the cover letter.

[1]
Before the patch:
# taskset -c 2,5 t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -n2 -r4 /dev/nvme0n1 /dev/nvme1n1
submitter=1, tid=2453, file=/dev/nvme1n1, node=-1
submitter=0, tid=2452, file=/dev/nvme0n1, node=-1
polled=1, fixedbufs=1, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=10.02M, BW=4.89GiB/s, IOS/call=31/31
IOPS=10.04M, BW=4.90GiB/s, IOS/call=31/31
IOPS=10.04M, BW=4.90GiB/s, IOS/call=31/31
Exiting on timeout
Maximum IOPS=10.04M

After the patch:
# taskset -c 2,5 t/io_uring -b512 -d128 -c32 -s32 -p1 -F1 -B1 -n2 -r4 /dev/nvme0n1 /dev/nvme1n1
submitter=1, tid=2412, file=/dev/nvme1n1, node=-1
submitter=0, tid=2411, file=/dev/nvme0n1, node=-1
polled=1, fixedbufs=1, register_files=1, buffered=0, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=10.02M, BW=4.89GiB/s, IOS/call=31/31
IOPS=10.03M, BW=4.90GiB/s, IOS/call=31/31
IOPS=10.04M, BW=4.90GiB/s, IOS/call=31/31
Exiting on timeout
Maximum IOPS=10.04M
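
To compare the two runs in [1] mechanically rather than by eye, something like the following could be used. It is only a minimal sketch and not part of the patchset: the helper names (iops_samples, max_iops) are hypothetical, and it assumes the t/io_uring output keeps the "IOPS=<n>M" and "Maximum IOPS=<n>M" line formats shown above.

```python
import re

# IOPS lines copied from the two runs in [1]; other output lines are omitted.
BEFORE = """\
IOPS=10.02M, BW=4.89GiB/s, IOS/call=31/31
IOPS=10.04M, BW=4.90GiB/s, IOS/call=31/31
IOPS=10.04M, BW=4.90GiB/s, IOS/call=31/31
Maximum IOPS=10.04M
"""

AFTER = """\
IOPS=10.02M, BW=4.89GiB/s, IOS/call=31/31
IOPS=10.03M, BW=4.90GiB/s, IOS/call=31/31
IOPS=10.04M, BW=4.90GiB/s, IOS/call=31/31
Maximum IOPS=10.04M
"""


def iops_samples(log):
    """Per-interval IOPS samples in millions (the summary line is skipped)."""
    return [float(v) for v in re.findall(r"^IOPS=([\d.]+)M", log, flags=re.MULTILINE)]


def max_iops(log):
    """Value of the 'Maximum IOPS' summary line in millions, or None if absent."""
    m = re.search(r"^Maximum IOPS=([\d.]+)M", log, flags=re.MULTILINE)
    return float(m.group(1)) if m else None


if __name__ == "__main__":
    for name, log in (("before", BEFORE), ("after", AFTER)):
        print(name, iops_samples(log), "max:", max_iops(log))
    # Identical maxima mean the patch shows no measurable regression here.
    print("max IOPS unchanged:", max_iops(BEFORE) == max_iops(AFTER))
```

With the numbers from [1], both runs report the same maximum, which is the point of the comparison: the extra meta-buffer plumbing does not cost anything measurable on the non-meta path.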