On Wed, Sep 11, 2019 at 10:52:03AM +0800, Jason Wang wrote: > On 2019/9/11 上午9:47, Tiwei Bie wrote: > > On Tue, Sep 10, 2019 at 04:19:34PM +0800, Jason Wang wrote: > > > This path introduces a new mdev transport for virtio. This is used to > > > use kernel virtio driver to drive the mediated device that is capable > > > of populating virtqueue directly. > > > > > > A new virtio-mdev driver will be registered to the mdev bus, when a > > > new virtio-mdev device is probed, it will register the device with > > > mdev based config ops. This means, unlike the exist hardware > > > transport, this is a software transport between mdev driver and mdev > > > device. The transport was implemented through: > > > > > > - configuration access was implemented through parent_ops->read()/write() > > > - vq/config callback was implemented through parent_ops->ioctl() > > > > > > This transport is derived from virtio MMIO protocol and was wrote for > > > kernel driver. But for the transport itself, but the design goal is to > > > be generic enough to support userspace driver (this part will be added > > > in the future). > > > > > > Note: > > > - current mdev assume all the parameter of parent_ops was from > > > userspace. This prevents us from implementing the kernel mdev > > > driver. For a quick POC, this patch just abuse those parameter and > > > assume the mdev device implementation will treat them as kernel > > > pointer. This should be addressed in the formal series by extending > > > mdev_parent_ops. > > > - for a quick POC, I just drive the transport from MMIO, I'm pretty > > > there's lot of optimization space for this. > > > > > > Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx> > > > --- > > > drivers/vfio/mdev/Kconfig | 7 + > > > drivers/vfio/mdev/Makefile | 1 + > > > drivers/vfio/mdev/virtio_mdev.c | 500 +++++++++++++++++++++++++++++++ > > > include/uapi/linux/virtio_mdev.h | 131 ++++++++ > > > 4 files changed, 639 insertions(+) > > > create mode 100644 drivers/vfio/mdev/virtio_mdev.c > > > create mode 100644 include/uapi/linux/virtio_mdev.h > > > > > [...] > > > diff --git a/include/uapi/linux/virtio_mdev.h b/include/uapi/linux/virtio_mdev.h > > > new file mode 100644 > > > index 000000000000..8040de6b960a > > > --- /dev/null > > > +++ b/include/uapi/linux/virtio_mdev.h > > > @@ -0,0 +1,131 @@ > > > +/* > > > + * Virtio mediated device driver > > > + * > > > + * Copyright 2019, Red Hat Corp. > > > + * > > > + * Based on Virtio MMIO driver by ARM Ltd, copyright ARM Ltd. 2011 > > > + * > > > + * This header is BSD licensed so anyone can use the definitions to implement > > > + * compatible drivers/servers. > > > + * > > > + * Redistribution and use in source and binary forms, with or without > > > + * modification, are permitted provided that the following conditions > > > + * are met: > > > + * 1. Redistributions of source code must retain the above copyright > > > + * notice, this list of conditions and the following disclaimer. > > > + * 2. Redistributions in binary form must reproduce the above copyright > > > + * notice, this list of conditions and the following disclaimer in the > > > + * documentation and/or other materials provided with the distribution. > > > + * 3. Neither the name of IBM nor the names of its contributors > > > + * may be used to endorse or promote products derived from this software > > > + * without specific prior written permission. > > > + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND > > > + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE > > > + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE > > > + * ARE DISCLAIMED. IN NO EVENT SHALL IBM OR CONTRIBUTORS BE LIABLE > > > + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL > > > + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS > > > + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) > > > + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT > > > + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY > > > + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF > > > + * SUCH DAMAGE. > > > + */ > > > +#ifndef _LINUX_VIRTIO_MDEV_H > > > +#define _LINUX_VIRTIO_MDEV_H > > > + > > > +#include <linux/interrupt.h> > > > +#include <linux/vringh.h> > > > +#include <uapi/linux/virtio_net.h> > > > + > > > +/* > > > + * Ioctls > > > + */ > > > + > > > +struct virtio_mdev_callback { > > > + irqreturn_t (*callback)(void *); > > > + void *private; > > > +}; > > > + > > > +#define VIRTIO_MDEV 0xAF > > > +#define VIRTIO_MDEV_SET_VQ_CALLBACK _IOW(VIRTIO_MDEV, 0x00, \ > > > + struct virtio_mdev_callback) > > > +#define VIRTIO_MDEV_SET_CONFIG_CALLBACK _IOW(VIRTIO_MDEV, 0x01, \ > > > + struct virtio_mdev_callback) > > > + > > > +#define VIRTIO_MDEV_DEVICE_API_STRING "virtio-mdev" > > > + > > > +/* > > > + * Control registers > > > + */ > > > + > > > +/* Magic value ("virt" string) - Read Only */ > > > +#define VIRTIO_MDEV_MAGIC_VALUE 0x000 > > > + > > > +/* Virtio device version - Read Only */ > > > +#define VIRTIO_MDEV_VERSION 0x004 > > > + > > > +/* Virtio device ID - Read Only */ > > > +#define VIRTIO_MDEV_DEVICE_ID 0x008 > > > + > > > +/* Virtio vendor ID - Read Only */ > > > +#define VIRTIO_MDEV_VENDOR_ID 0x00c > > > + > > > +/* Bitmask of the features supported by the device (host) > > > + * (32 bits per set) - Read Only */ > > > +#define VIRTIO_MDEV_DEVICE_FEATURES 0x010 > > > + > > > +/* Device (host) features set selector - Write Only */ > > > +#define VIRTIO_MDEV_DEVICE_FEATURES_SEL 0x014 > > > + > > > +/* Bitmask of features activated by the driver (guest) > > > + * (32 bits per set) - Write Only */ > > > +#define VIRTIO_MDEV_DRIVER_FEATURES 0x020 > > > + > > > +/* Activated features set selector - Write Only */ > > > +#define VIRTIO_MDEV_DRIVER_FEATURES_SEL 0x024 > > > + > > > +/* Queue selector - Write Only */ > > > +#define VIRTIO_MDEV_QUEUE_SEL 0x030 > > > + > > > +/* Maximum size of the currently selected queue - Read Only */ > > > +#define VIRTIO_MDEV_QUEUE_NUM_MAX 0x034 > > > + > > > +/* Queue size for the currently selected queue - Write Only */ > > > +#define VIRTIO_MDEV_QUEUE_NUM 0x038 > > > + > > > +/* Ready bit for the currently selected queue - Read Write */ > > > +#define VIRTIO_MDEV_QUEUE_READY 0x044 > > > + > > > +/* Alignment of virtqueue - Read Only */ > > > +#define VIRTIO_MDEV_QUEUE_ALIGN 0x048 > > > + > > > +/* Queue notifier - Write Only */ > > > +#define VIRTIO_MDEV_QUEUE_NOTIFY 0x050 > > > + > > > +/* Device status register - Read Write */ > > > +#define VIRTIO_MDEV_STATUS 0x060 > > > + > > > +/* Selected queue's Descriptor Table address, 64 bits in two halves */ > > > +#define VIRTIO_MDEV_QUEUE_DESC_LOW 0x080 > > > +#define VIRTIO_MDEV_QUEUE_DESC_HIGH 0x084 > > > + > > > +/* Selected queue's Available Ring address, 64 bits in two halves */ > > > +#define VIRTIO_MDEV_QUEUE_AVAIL_LOW 0x090 > > > +#define VIRTIO_MDEV_QUEUE_AVAIL_HIGH 0x094 > > > + > > > +/* Selected queue's Used Ring address, 64 bits in two halves */ > > > +#define VIRTIO_MDEV_QUEUE_USED_LOW 0x0a0 > > > +#define VIRTIO_MDEV_QUEUE_USED_HIGH 0x0a4 > > > + > > > +/* Configuration atomicity value */ > > > +#define VIRTIO_MDEV_CONFIG_GENERATION 0x0fc > > > + > > > +/* The config space is defined by each driver as > > > + * the per-driver configuration space - Read Write */ > > > +#define VIRTIO_MDEV_CONFIG 0x100 > > IIUC, we can use above registers with virtio-mdev parent's > > read()/write() to access the mdev device from kernel driver. > > As you suggested, it's a choice to build vhost-mdev on top > > of this abstraction as well. But virtio is the frontend > > device which lacks some vhost backend features, e.g. get > > vring base, set vring base, negotiate vhost features, etc. > > So I'm wondering, does it make sense to reserve some space > > for vhost-mdev in kernel to do vhost backend specific setups? > > Or do you have any other thoughts? > > > Good point. It's just a quick POC, I plan to add vhost features in the next > release: > > 1) set/get virtqueue state (e.g vring base) > 2) dirty page tracking > 3) for vhost features, if you mean _F_LOG_ALL, it could be done by 2), for > IOTLB, it requires more thought on the API since current IOTLB API is no > implemented through ioctl ... > > > > > > Besides, I'm also wondering, what's the purpose of making > > above registers part of UAPI? > > > Sorry if it brings confusion. If we do vhost-mdev on top of this, there's no > need for this to go for UAPI. > > > > And if we make them part > > of UAPI, do we also need to make them part of virtio spec? > > > Yes, but I do not think we should put it into UAPI. Technically, userspace > can use this transport as well with some modification on the interrupt part, > e.g using eventfd. But is there any value for doing this instead of vhost? Yeah, I agree. As we already have vhost, I don't see the value to make virtio become "virtio + vhost". Thanks, Tiwei