On 9/4/22 11:01 AM, Kanchan Joshi wrote:
> On Sat, Sep 03, 2022 at 11:00:43AM -0600, Jens Axboe wrote:
>> On 9/2/22 3:25 PM, Jens Axboe wrote:
>>> On 9/2/22 1:32 PM, Jens Axboe wrote:
>>>> On 9/2/22 12:46 PM, Kanchan Joshi wrote:
>>>>> On Fri, Sep 02, 2022 at 10:32:16AM -0600, Jens Axboe wrote:
>>>>>> On 9/2/22 10:06 AM, Jens Axboe wrote:
>>>>>>> On 9/2/22 9:16 AM, Kanchan Joshi wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Currently uring-cmd lacks the ability to leverage the pre-registered
>>>>>>>> buffers. This series adds the support in uring-cmd, and plumbs
>>>>>>>> nvme passthrough to work with it.
>>>>>>>>
>>>>>>>> Using registered-buffers showed peak-perf hike from 1.85M to 2.17M IOPS
>>>>>>>> in my setup.
>>>>>>>>
>>>>>>>> Without fixedbufs
>>>>>>>> *****************
>>>>>>>> # taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p0 -F1 -B0 -O0 -n1 -u1 /dev/ng0n1
>>>>>>>> submitter=0, tid=5256, file=/dev/ng0n1, node=-1
>>>>>>>> polled=0, fixedbufs=0/0, register_files=1, buffered=1, QD=128
>>>>>>>> Engine=io_uring, sq_ring=128, cq_ring=128
>>>>>>>> IOPS=1.85M, BW=904MiB/s, IOS/call=32/31
>>>>>>>> IOPS=1.85M, BW=903MiB/s, IOS/call=32/32
>>>>>>>> IOPS=1.85M, BW=902MiB/s, IOS/call=32/32
>>>>>>>> ^CExiting on signal
>>>>>>>> Maximum IOPS=1.85M
>>>>>>>
>>>>>>> With the poll support queued up, I ran this one as well. tldr is:
>>>>>>>
>>>>>>> bdev (non pt)   122M IOPS
>>>>>>> irq driven      51-52M IOPS
>>>>>>> polled          71M IOPS
>>>>>>> polled+fixed    78M IOPS
>>
>> Followup on this, since t/io_uring didn't correctly detect NUMA nodes
>> for passthrough.
>>
>> With the current tree and the patchset I just sent for iopoll and the
>> caching fix that's in the block tree, here's the final score:
>>
>> polled+fixed passthrough   105M IOPS
>>
>> which is getting pretty close to the bdev polled fixed path as well.
>> I think that is starting to look pretty good!
> Great!
> In my setup (single disk/numa-node), current kernel shows-
>
> Block MIOPS
> ***********
> command:t/io_uring -b512 -d128 -c32 -s32 -p0 -F1 -B0 -P1 -n1 /dev/nvme0n1
> plain: 1.52
> plain+fb: 1.77
> plain+poll: 2.23
> plain+fb+poll: 2.61
>
> Passthru MIOPS
> **************
> command:t/io_uring -b512 -d128 -c32 -s32 -p0 -F1 -B0 -O0 -P1 -u1 -n1 /dev/ng0n1
> plain: 1.78
> plain+fb: 2.08
> plain+poll: 2.21
> plain+fb+poll: 2.69

Interesting, here's what I have:

Block MIOPS
============
plain: 2.90
plain+fb: 3.0
plain+poll: 4.04
plain+fb+poll: 5.09

Passthru MIOPS
==============
plain: 2.37
plain+fb: 2.84
plain+poll: 3.65
plain+fb+poll: 4.93

This is a gen2 optane, it maxes out at right around 5.1M IOPS. Note that
I have disabled iostats and merges generally in my runs:

echo 0 > /sys/block/nvme0n1/queue/iostats
echo 2 > /sys/block/nvme0n1/queue/nomerges

which will impact block more than passthru obviously, particularly
the nomerges. iostats should have a similar impact on both of them (but
I haven't tested either of those without those disabled).

-- 
Jens Axboe