On Tue, Sep 17, 2019 at 03:26:30PM +0800, Jason Wang wrote:
> On 2019/9/17 9:02 AM, Tiwei Bie wrote:
> > diff --git a/drivers/vhost/mdev.c b/drivers/vhost/mdev.c
> > new file mode 100644
> > index 000000000000..8c6597aff45e
> > --- /dev/null
> > +++ b/drivers/vhost/mdev.c
> > @@ -0,0 +1,462 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2018-2019 Intel Corporation.
> > + */
> > +
> > +#include <linux/compat.h>
> > +#include <linux/kernel.h>
> > +#include <linux/miscdevice.h>
> > +#include <linux/mdev.h>
> > +#include <linux/module.h>
> > +#include <linux/vfio.h>
> > +#include <linux/vhost.h>
> > +#include <linux/virtio_mdev.h>
> > +
> > +#include "vhost.h"
> > +
> > +struct vhost_mdev {
> > +	struct mutex mutex;
> > +	struct vhost_dev dev;
> > +	struct vhost_virtqueue *vqs;
> > +	int nvqs;
> > +	u64 state;
> > +	u64 features;
> > +	u64 acked_features;
> > +	struct vfio_group *vfio_group;
> > +	struct vfio_device *vfio_device;
> > +	struct mdev_device *mdev;
> > +};
> > +
> > +/*
> > + * XXX
> > + * We assume virtio_mdev.ko exposes below symbols for now, as we
> > + * don't have a proper way to access parent ops directly yet.
> > + *
> > + * virtio_mdev_readl()
> > + * virtio_mdev_writel()
> > + */
> > +extern u32 virtio_mdev_readl(struct mdev_device *mdev, loff_t off);
> > +extern void virtio_mdev_writel(struct mdev_device *mdev, loff_t off, u32 val);
>
>
> Need to consider a better approach, I feel we should do it through some kind
> of mdev driver instead of talking to the mdev device directly.

Yeah, a better approach is really needed here.
Besides, we may want a way to allow accessing the mdev
device_ops proposed in the series below from outside the
drivers/vfio/mdev/ directory.

https://lkml.org/lkml/2019/9/12/151

I.e. allow putting mdev drivers outside the above directory.

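To make the direction concrete, here is a rough userspace sketch of what "going through an ops table" could look like, as opposed to linking against exported virtio_mdev_readl()/virtio_mdev_writel() symbols. Everything named demo_* or mock_* below is hypothetical and made up for illustration; it is not the device_ops interface from the series linked above, just the general pattern of a consumer calling register accessors through function pointers supplied by the parent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table supplied by the parent; names are illustrative only. */
struct demo_device_ops {
	uint32_t (*readl)(void *priv, size_t off);
	void (*writel)(void *priv, size_t off, uint32_t val);
};

/* Hypothetical device handle: the consumer only sees ops + opaque state. */
struct demo_device {
	const struct demo_device_ops *ops;
	void *priv;
};

/* Consumer-side accessors: no direct symbol dependency on the parent. */
uint32_t demo_readl(struct demo_device *dev, size_t off)
{
	assert(dev && dev->ops && dev->ops->readl);
	return dev->ops->readl(dev->priv, off);
}

void demo_writel(struct demo_device *dev, size_t off, uint32_t val)
{
	assert(dev && dev->ops && dev->ops->writel);
	dev->ops->writel(dev->priv, off, val);
}

/* Mock parent: a small bank of 32-bit registers, indexed by byte offset / 4. */
#define DEMO_NREGS 16

struct demo_regs {
	uint32_t regs[DEMO_NREGS];
};

uint32_t mock_readl(void *priv, size_t off)
{
	struct demo_regs *r = priv;

	/* Out-of-range reads return 0 in this mock. */
	return (off / 4 < DEMO_NREGS) ? r->regs[off / 4] : 0;
}

void mock_writel(void *priv, size_t off, uint32_t val)
{
	struct demo_regs *r = priv;

	/* Out-of-range writes are silently dropped in this mock. */
	if (off / 4 < DEMO_NREGS)
		r->regs[off / 4] = val;
}

const struct demo_device_ops mock_ops = {
	.readl = mock_readl,
	.writel = mock_writel,
};
```

With something along these lines, the consumer would only need a pointer to the ops table, so it would not matter which directory the driver lives in.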
> > +
> > +	for (queue_id = 0; queue_id < m->nvqs; queue_id++) {
> > +		vq = &m->vqs[queue_id];
> > +
> > +		if (!vq->desc || !vq->avail || !vq->used)
> > +			break;
> > +
> > +		virtio_mdev_writel(mdev, VIRTIO_MDEV_QUEUE_NUM, vq->num);
> > +
> > +		if (!vhost_translate_ring_addr(vq, (u64)vq->desc,
> > +					       vhost_get_desc_size(vq, vq->num),
> > +					       &addr))
> > +			return -EINVAL;
>
>
> Interesting, any reason for doing such kinds of translation to HVA? I
> believe the address should already be an IOVA that has been mapped by VFIO.

Currently, in the software-based vhost-kernel and vhost-user
backends, QEMU will pass ring addresses as HVA in the SET_VRING_ADDR
ioctl when iotlb isn't enabled. If it's OK to let QEMU pass GPA
in vhost-mdev in this case, then this translation won't be needed.

Thanks,
Tiwei
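
For reference, the relationship between the two address spaces discussed above (GPA vs. HVA) can be sketched in plain userspace C. This is only an illustration and not what vhost_translate_ring_addr() in the patch does: it assumes a region table whose fields mirror struct vhost_memory_region from <linux/vhost.h>, redeclared here so the sketch builds anywhere. The names demo_mem_region and demo_gpa_to_hva are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Field names mirror struct vhost_memory_region; the struct itself is a
 * local, hypothetical stand-in.
 */
struct demo_mem_region {
	uint64_t guest_phys_addr;	/* start of the range in guest-physical space */
	uint64_t memory_size;		/* length of the range in bytes */
	uint64_t userspace_addr;	/* where the range is mapped in the host process */
};

/*
 * Resolve [gpa, gpa + len) to a host virtual address.
 * Returns 0 and stores the HVA on success, or -1 if the range is not
 * fully contained in a single region.
 */
int demo_gpa_to_hva(const struct demo_mem_region *regions, size_t nregions,
		    uint64_t gpa, uint64_t len, uint64_t *hva)
{
	assert(regions != NULL && hva != NULL);

	for (size_t i = 0; i < nregions; i++) {
		const struct demo_mem_region *r = &regions[i];
		uint64_t off;

		if (gpa < r->guest_phys_addr)
			continue;

		off = gpa - r->guest_phys_addr;
		/* Written this way to avoid overflow in off + len. */
		if (off >= r->memory_size || len > r->memory_size - off)
			continue;

		*hva = r->userspace_addr + off;
		return 0;
	}

	return -1;
}
```

The point of the sketch is just that an HVA and a GPA for the same ring differ by a per-region offset, so whichever form QEMU passes in SET_VRING_ADDR decides whether the backend has to do a lookup of this kind.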