On 9/2/22 10:06 AM, Jens Axboe wrote:
> On 9/2/22 9:16 AM, Kanchan Joshi wrote:
>> Hi,
>>
>> Currently uring-cmd lacks the ability to leverage the pre-registered
>> buffers. This series adds the support in uring-cmd, and plumbs
>> nvme passthrough to work with it.
>>
>> Using registered-buffers showed peak-perf hike from 1.85M to 2.17M IOPS
>> in my setup.
>>
>> Without fixedbufs
>> *****************
>> # taskset -c 0 t/io_uring -b512 -d128 -c32 -s32 -p0 -F1 -B0 -O0 -n1 -u1 /dev/ng0n1
>> submitter=0, tid=5256, file=/dev/ng0n1, node=-1
>> polled=0, fixedbufs=0/0, register_files=1, buffered=1, QD=128
>> Engine=io_uring, sq_ring=128, cq_ring=128
>> IOPS=1.85M, BW=904MiB/s, IOS/call=32/31
>> IOPS=1.85M, BW=903MiB/s, IOS/call=32/32
>> IOPS=1.85M, BW=902MiB/s, IOS/call=32/32
>> ^CExiting on signal
>> Maximum IOPS=1.85M
>
> With the poll support queued up, I ran this one as well. tldr is:
>
> bdev (non pt)   122M IOPS
> irq driven      51-52M IOPS
> polled          71M IOPS
> polled+fixed    78M IOPS
>
> Looking at profiles, it looks like the bio is still being allocated
> and freed and not dipping into the alloc cache, which is using a
> substantial amount of CPU. I'll poke a bit and see what's going on...

It's using the fs_bio_set, and that doesn't have the PERCPU alloc cache
enabled. With the below, we then do:

polled+fixed    82M

I suspect the remainder is due to the lack of batching on the request
freeing side, at least some of it. Haven't really looked deeper yet.

One issue I saw - try and use passthrough polling without having any
poll queues defined and it'll stall just spinning on completions. You
need to ensure that these are processed as well - look at how the
non-passthrough io_uring poll path handles it.

diff --git a/block/bio.c b/block/bio.c
index 3d3a2678fea2..cba6b1c02eb8 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1754,7 +1754,7 @@ static int __init init_bio(void)
 	cpuhp_setup_state_multi(CPUHP_BIO_DEAD, "block/bio:dead", NULL,
 					bio_cpu_dead);
 
-	if (bioset_init(&fs_bio_set, BIO_POOL_SIZE, 0, BIOSET_NEED_BVECS))
+	if (bioset_init(&fs_bio_set, BIO_POOL_SIZE, 0, BIOSET_NEED_BVECS | BIOSET_PERCPU_CACHE))
 		panic("bio: can't allocate bios\n");
 
 	if (bioset_integrity_create(&fs_bio_set, BIO_POOL_SIZE))

-- 
Jens Axboe
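
For anyone unfamiliar with what an alloc cache buys here, the general
idea can be sketched in userspace C. This is only an illustration, not
the kernel's bio code: struct obj, struct obj_cache, cache_alloc(),
cache_free() and CACHE_MAX are hypothetical names, and the sketch is
single-threaded, whereas the real thing keeps one such list per CPU.
Freed objects are parked on a small list, and later allocations take
them from that list before falling back to the general allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_MAX 64	/* hypothetical cap on parked objects */

struct obj {
	struct obj *next;	/* free-list link, valid only while parked */
	char payload[120];
};

struct obj_cache {
	struct obj *free_list;	/* recently freed objects, LIFO order */
	unsigned int nr;	/* number of objects on free_list */
	unsigned long hits;	/* allocations served from the cache */
	unsigned long misses;	/* allocations that fell back to malloc() */
};

/* Fast path: pop a parked object. Slow path: go to malloc(). */
static struct obj *cache_alloc(struct obj_cache *c)
{
	struct obj *o = c->free_list;

	if (o) {
		c->free_list = o->next;
		c->nr--;
		c->hits++;
		return o;
	}
	c->misses++;
	return malloc(sizeof(*o));
}

/* Park the object for reuse unless the cache is already full. */
static void cache_free(struct obj_cache *c, struct obj *o)
{
	assert(o != NULL);
	if (c->nr < CACHE_MAX) {
		o->next = c->free_list;
		c->free_list = o;
		c->nr++;
		return;
	}
	free(o);
}
```

With a zero-initialized struct obj_cache, the first cache_alloc() is a
miss and hits malloc(); after a cache_free(), the next cache_alloc()
hands back the same object without touching the allocator. That
repeated alloc/free round trip is roughly the cost the profile above
is pointing at.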