On Mon, Feb 03, 2025 at 07:45:11AM -0800, Keith Busch wrote:
> From: Keith Busch <kbusch@xxxxxxxxxx>
>
> This is a new look at supporting zero copy with ublk.

Just to give some numbers behind this since I didn't in the original cover
letter. The numbers are from a 1.6GHz Xeon platform.

Using the ublksrv patch I provided in the cover letter, created two ublk
block devices with null_blk backings:

  ublk add -t loop -f /dev/nullb0
  ublk add -t loop -f /dev/nullb1 -z

Using t/io_uring, comparing the ublk device without zero-copy vs the one
with zero-copy (-z) enabled:

4k read:

Legacy:
  IOPS=387.78K, BW=1514MiB/s, IOS/call=32/32
  IOPS=395.14K, BW=1543MiB/s, IOS/call=32/32
  IOPS=395.68K, BW=1545MiB/s, IOS/call=32/31

Zero-copy:
  IOPS=482.69K, BW=1885MiB/s, IOS/call=32/31
  IOPS=481.34K, BW=1880MiB/s, IOS/call=32/32
  IOPS=481.66K, BW=1881MiB/s, IOS/call=32/32

64k read:

Legacy:
  IOPS=73248, BW=4.58GiB/s, IOS/call=32/32
  IOPS=73664, BW=4.60GiB/s, IOS/call=32/32
  IOPS=72288, BW=4.52GiB/s, IOS/call=32/32

Zero-copy:
  IOPS=381.76K, BW=23.86GiB/s, IOS/call=32/31
  IOPS=378.18K, BW=23.64GiB/s, IOS/call=32/32
  IOPS=379.52K, BW=23.72GiB/s, IOS/call=32/32

The register/unregister overhead is low enough to show a decent
improvement even at 4k IO. And it's using half the memory with lower CPU
utilization per IO, so all good wins.
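
For a rough sense of scale, the relative IOPS gain implied by the samples above can be worked out as below. This is only a sketch: it assumes that a plain average of the three reported samples per case is a fair summary, and the names `SAMPLES`, `mean_iops` and `speedup` are hypothetical helpers, not part of t/io_uring or ublksrv.

```python
from statistics import mean

# IOPS samples copied from the t/io_uring output above ("K" expanded to x1000).
SAMPLES = {
    "4k": {
        "legacy": [387_780, 395_140, 395_680],
        "zero_copy": [482_690, 481_340, 481_660],
    },
    "64k": {
        "legacy": [73_248, 73_664, 72_288],
        "zero_copy": [381_760, 378_180, 379_520],
    },
}


def mean_iops(block_size: str, mode: str) -> float:
    """Average the reported IOPS samples for one block size and mode."""
    return mean(SAMPLES[block_size][mode])


def speedup(block_size: str) -> float:
    """Ratio of mean zero-copy IOPS to mean legacy IOPS."""
    return mean_iops(block_size, "zero_copy") / mean_iops(block_size, "legacy")


for size in SAMPLES:
    print(f"{size} read: zero-copy is {speedup(size):.2f}x legacy IOPS")
```

With these samples the ratio comes out to roughly 1.23x at 4k and roughly 5.2x at 64k, consistent with the copy cost growing with IO size while the register/unregister cost stays fixed per IO.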