On Wed, Jul 03, 2019 at 06:09:51PM +0800, Jason Wang wrote: > On 2019/7/3 下午5:13, Tiwei Bie wrote: > > Details about this can be found here: > > > > https://lwn.net/Articles/750770/ > > > > What's new in this version > > ========================== > > > > A new VFIO device type is introduced - vfio-vhost. This addressed > > some comments from here: https://patchwork.ozlabs.org/cover/984763/ > > > > Below is the updated device interface: > > > > Currently, there are two regions of this device: 1) CONFIG_REGION > > (VFIO_VHOST_CONFIG_REGION_INDEX), which can be used to setup the > > device; 2) NOTIFY_REGION (VFIO_VHOST_NOTIFY_REGION_INDEX), which > > can be used to notify the device. > > > > 1. CONFIG_REGION > > > > The region described by CONFIG_REGION is the main control interface. > > Messages will be written to or read from this region. > > > > The message type is determined by the `request` field in message > > header. The message size is encoded in the message header too. > > The message format looks like this: > > > > struct vhost_vfio_op { > > __u64 request; > > __u32 flags; > > /* Flag values: */ > > #define VHOST_VFIO_NEED_REPLY 0x1 /* Whether need reply */ > > __u32 size; > > union { > > __u64 u64; > > struct vhost_vring_state state; > > struct vhost_vring_addr addr; > > } payload; > > }; > > > > The existing vhost-kernel ioctl cmds are reused as the message > > requests in above structure. > > > Still a comments like V1. What's the advantage of inventing a new protocol? I'm trying to make it work in VFIO's way.. > I believe either of the following should be better: > > - using vhost ioctl, we can start from SET_VRING_KICK/SET_VRING_CALL and > extend it with e.g notify region. The advantages is that all exist userspace > program could be reused without modification (or minimal modification). And > vhost API hides lots of details that is not necessary to be understood by > application (e.g in the case of container). Do you mean reusing vhost's ioctl on VFIO device fd directly, or introducing another mdev driver (i.e. vhost_mdev instead of using the existing vfio_mdev) for mdev device? > > - using PCI layout, then you don't even need to re-invent notifiy region at > all and we can pass-through them to guest. Like what you said previously, virtio has transports other than PCI. And it will look a bit odd when using transports other than PCI.. > > Personally, I prefer vhost ioctl. +1 > > > > [...] > > > > 3. VFIO interrupt ioctl API > > > > VFIO interrupt ioctl API is used to setup device interrupts. > > IRQ-bypass can also be supported. > > > > Currently, the data path interrupt can be configured via the > > VFIO_VHOST_VQ_IRQ_INDEX with virtqueue's callfd. > > > How about DMA API? Do you expect to use VFIO IOMMU API or using vhost > SET_MEM_TABLE? VFIO IOMMU API is more generic for sure but with > SET_MEM_TABLE DMA can be done at the level of parent device which means it > can work for e.g the card with on-chip IOMMU. Agree. In this RFC, it assumes userspace will use VFIO IOMMU API to do the DMA programming. But like what you said, there could be a problem when using cards with on-chip IOMMU. > > And what's the plan for vIOMMU? As this RFC assumes userspace will use VFIO IOMMU API, userspace just needs to follow the same way like what vfio-pci device does in QEMU to support vIOMMU. > > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@xxxxxxxxx> > > --- > > drivers/vhost/Makefile | 2 + > > drivers/vhost/vdpa.c | 770 +++++++++++++++++++++++++++++++++++++ > > include/linux/vdpa_mdev.h | 72 ++++ > > include/uapi/linux/vfio.h | 19 + > > include/uapi/linux/vhost.h | 25 ++ > > 5 files changed, 888 insertions(+) > > create mode 100644 drivers/vhost/vdpa.c > > create mode 100644 include/linux/vdpa_mdev.h > > > We probably need some sample parent device implementation. It could be a > software datapath like e.g we can start from virtio-net device in guest or a > vhost/tap on host. Yeah, something like this would be interesting! Thanks, Tiwei > > Thanks > > > >