Re: [PATCH 0/3][RFC] virtio-blk: add io_uring passthrough support for virtio-blk

On 12/17/24 10:08 AM, Jason Wang wrote:
On Mon, Dec 16, 2024 at 8:07 PM Ferry Meng <mengferry@xxxxxxxxxxxxxxxxx> wrote:

On 12/16/24 3:38 PM, Jason Wang wrote:
On Mon, Dec 16, 2024 at 10:01 AM Ferry Meng <mengferry@xxxxxxxxxxxxxxxxx> wrote:
On 12/3/24 8:14 PM, Ferry Meng wrote:
We seek to develop a more flexible way to use virtio-blk and bypass the block
layer logic in order to accomplish certain performance optimizations. As a
result, we referred to the implementation of io_uring passthrough in NVMe
and implemented it in the virtio-blk driver. This patch series adds io_uring
passthrough support for virtio-blk devices, resulting in lower submit latency
and increased flexibility when utilizing virtio-blk.

To test this patch series, I changed fio's code:
1. Added virtio-blk support to engines/io_uring.c.
2. Added virtio-blk support to the t/io_uring.c testing tool.
Link: https://github.com/jdmfr/fio

Using t/io_uring-vblk, the performance of virtio-blk based on uring-cmd
scales better than access through the block device, as shown below
(virtio-blk with QEMU, fio at queue depth 1):
(passthru) read: IOPS=17.2k, BW=67.4MiB/s (70.6MB/s)
slat (nsec): min=2907, max=43592, avg=3981.87, stdev=595.10
clat (usec): min=38, max=285, avg=53.47, stdev= 8.28
lat (usec): min=44, max=288, avg=57.45, stdev= 8.28
(block) read: IOPS=15.3k, BW=59.8MiB/s (62.7MB/s)
slat (nsec): min=3408, max=35366, avg=5102.17, stdev=790.79
clat (usec): min=35, max=343, avg=59.63, stdev=10.26
lat (usec): min=43, max=349, avg=64.73, stdev=10.21

Testing the virtio-blk device with fio using 'ioengine=io_uring_cmd'
and 'ioengine=io_uring' also demonstrates improvements in submit latency.
(passthru) taskset -c 0 t/io_uring-vblk -b4096 -d8 -c4 -s4 -p0 -F1 -B0 -O0 -n1 -u1 /dev/vdcc0
IOPS=189.80K, BW=741MiB/s, IOS/call=4/3
IOPS=187.68K, BW=733MiB/s, IOS/call=4/3
(block) taskset -c 0 t/io_uring-vblk -b4096 -d8 -c4 -s4 -p0 -F1 -B0 -O0 -n1 -u0 /dev/vdc
IOPS=101.51K, BW=396MiB/s, IOS/call=4/3
IOPS=100.01K, BW=390MiB/s, IOS/call=4/4
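
To make the comparison between the two paths easier to read, the relative differences can be worked out directly from the figures quoted above. This is only a small sketch: the helper name `pct_change` is made up for illustration, and the t/io_uring-vblk comparison uses the first reported sample of each run, which is an arbitrary choice.

```python
def pct_change(new: float, old: float) -> float:
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0


# Values copied from the results quoted above.
fio_slat_ns = {"passthru": 3981.87, "block": 5102.17}  # fio avg submit latency
fio_iops_k = {"passthru": 17.2, "block": 15.3}         # fio read IOPS (thousands)
t_iops_k = {"passthru": 189.80, "block": 101.51}       # t/io_uring-vblk, first sample

slat_delta = pct_change(fio_slat_ns["passthru"], fio_slat_ns["block"])
fio_iops_delta = pct_change(fio_iops_k["passthru"], fio_iops_k["block"])
t_iops_delta = pct_change(t_iops_k["passthru"], t_iops_k["block"])

print(f"fio avg slat:         {slat_delta:+.1f}%")
print(f"fio IOPS:             {fio_iops_delta:+.1f}%")
print(f"t/io_uring-vblk IOPS: {t_iops_delta:+.1f}%")
```

With these inputs the average fio submit latency drops by about 22%, fio IOPS rises by about 12%, and the t/io_uring-vblk IOPS rises by about 87%.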

The performance overhead of submitting IO can be decreased by 25% overall
with this patch series. The implementation primarily references 'nvme io_uring
passthrough', supporting io_uring_cmd through a separate character interface
(temporarily named /dev/vdXc0). Since this is an early version, many
details still need to be considered and redesigned, for example:
● Currently, it only handles READ/WRITE scenarios; more complex operations
such as discard or zone ops are not included. (A normal sqe64 is sufficient,
in my opinion; in later revisions, sqe128 and cqe32 might not be needed.)
● ......

I would appreciate any useful recommendations.

Ferry Meng (3):
     virtio-blk: add virtio-blk chardev support.
     virtio-blk: add uring_cmd support for I/O passthru on chardev.
     virtio-blk: add uring_cmd iopoll support.

    drivers/block/virtio_blk.c      | 325 +++++++++++++++++++++++++++++++-
    include/uapi/linux/virtio_blk.h |  16 ++
    2 files changed, 336 insertions(+), 5 deletions(-)
Hi Michael & Jason,

What is your opinion, as virtio-blk maintainers? Looking forward to
your reply.

Thanks
If I understand this correctly, this proposal wants to make io_uring a
transport for virtio-blk commands, so the application doesn't need
to worry about compatibility etc. This seems to be fine.

But I wonder about the security considerations: for example, do we
allow all virtio-blk commands to be passed through, and why?
Regarding security considerations, the generic char-dev belongs to root, so
only root can use this passthrough path.
This seems like a restriction. Many applications want to run
without privileges in order to be safe.

I'm sorry, my previous explanation may have caused a misunderstanding. The generic cdev file's default group is 'root', but we can simply use 'chgrp' to change it to whatever group we want.

After that, applications can use it just as they would a standard file.
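
As an illustration of the 'chgrp' approach described above, here is a minimal sketch of handing a device node to a dedicated group and checking the result. The function names are made up for this example, and the explicit mode 0660 is an assumption: the thread only mentions 'chgrp', and the node's actual default permissions are not stated.

```python
import os
import stat


def grant_group_access(path: str, gid: int) -> None:
    """Roughly 'chgrp <gid> path' followed by 'chmod 660 path'."""
    os.chown(path, -1, gid)  # -1 keeps the current owner unchanged
    os.chmod(path, 0o660)    # assumption: owner and group get read/write


def group_can_read_write(path: str, gid: int) -> bool:
    """Return True if `path` belongs to group `gid` with group rw bits set."""
    st = os.stat(path)
    rw = stat.S_IRGRP | stat.S_IWGRP
    return st.st_gid == gid and (st.st_mode & rw) == rw


# Hypothetical usage, run as root; the group name "vblk-users" is invented:
#   import grp
#   grant_group_access("/dev/vdcc0", grp.getgrnam("vblk-users").gr_gid)
```

A change made this way to a device node generally does not survive device re-creation, so a persistent setup would more likely rely on a udev rule; that is outside what the thread discusses.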

On the other hand, as far as I know, virtio-blk commands are all related
to I/O operations, so we could support all of those opcodes while bypassing
the VFS and block layers (if we want). This RFC patch series implements only
the most basic read/write; the others will be considered later.

Thanks

Thanks



