On 2020/6/29 下午11:49, Michael S. Tsirkin wrote:
On Mon, Jun 29, 2020 at 05:26:03PM +0800, Jason Wang wrote:On 2020/6/28 下午5:58, Michael S. Tsirkin wrote:On Thu, Jun 18, 2020 at 01:56:25PM +0800, Jason Wang wrote:This patches extend the vhost IOTLB API to accept batch updating hints form userspace. When userspace wants update the device IOTLB in a batch, it may do: 1) Write vhost_iotlb_msg with VHOST_IOTLB_BATCH_BEGIN flag 2) Perform a batch of IOTLB updating via VHOST_IOTLB_UPDATE/INVALIDATE 3) Write vhost_iotlb_msg with VHOST_IOTLB_BATCH_END flagAs long as we are extending the interface, is there some way we could cut down the number of system calls needed here?I'm not sure it's worth to do that since usually we only have less than 10 regions.Well with a guest iommu I'm guessing it can go up significantly, right?
The batching flag is not mandatory, so userspace can still choose to update a single IOTLB mapping through a single system call without batch flag.
A possible method is to carry multiple vhost_iotlb_message in one system call.That's an option. Also, can kernel handle a batch that is too large by applying parts of it iteratively?
I think so, we can iterate a vhost iotlb message one by one through iov iterator.
Or must all changes take place atomically after BATCH_END?
For changes: - if you mean the changes in vhost IOTLB, it's not atomically- if you mean the mapping seen by vDPA device driver, it could be atomically for the device driver that implements set_map() method, since we pass a interval tree to the device.
If atomically, we might need to limit batch size to make sure kernel can keep a copy in memory.
It's the responsibility of guest driver to make sure there won't be a race condition between DMA and map updating. So it looks to me we don't need to care about that.
Vhost-vdpa may decide to batch the IOMMU/IOTLB updating in step 3 when vDPA device support set_map() ops. This is useful for the vDPA device that want to know all the mappings to tweak their own DMA translation logic. For vDPA device that doesn't require set_map(), no behavior changes. This capability is advertised via VHOST_BACKEND_F_IOTLB_BATCH capability. Signed-off-by: Jason Wang <jasowang@xxxxxxxxxx> --- drivers/vhost/vdpa.c | 30 +++++++++++++++++++++++------- include/uapi/linux/vhost.h | 2 ++ include/uapi/linux/vhost_types.h | 7 +++++++ 3 files changed, 32 insertions(+), 7 deletions(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 453057421f80..8f624bbafee7 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -56,7 +56,9 @@ enum { }; enum { - VHOST_VDPA_BACKEND_FEATURES = (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2) + VHOST_VDPA_BACKEND_FEATURES = + (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2) | + (1ULL << VHOST_BACKEND_F_IOTLB_BATCH), }; /* Currently, only network backend w/o multiqueue is supported. */ @@ -77,6 +79,7 @@ struct vhost_vdpa { int virtio_id; int minor; struct eventfd_ctx *config_ctx; + int in_batch; }; static DEFINE_IDA(vhost_vdpa_ida); @@ -125,6 +128,7 @@ static void vhost_vdpa_reset(struct vhost_vdpa *v) const struct vdpa_config_ops *ops = vdpa->config; ops->set_status(vdpa, 0); + v->in_batch = 0; } static long vhost_vdpa_get_device_id(struct vhost_vdpa *v, u8 __user *argp) @@ -540,9 +544,10 @@ static int vhost_vdpa_map(struct vhost_vdpa *v, if (ops->dma_map) r = ops->dma_map(vdpa, iova, size, pa, perm); - else if (ops->set_map) - r = ops->set_map(vdpa, dev->iotlb); - else + else if (ops->set_map) { + if (!v->in_batch) + r = ops->set_map(vdpa, dev->iotlb); + } else r = iommu_map(v->domain, iova, pa, size, perm_to_iommu_flags(perm)); @@ -559,9 +564,10 @@ static void vhost_vdpa_unmap(struct vhost_vdpa *v, u64 iova, u64 size) if (ops->dma_map) ops->dma_unmap(vdpa, iova, size); - else if (ops->set_map) - ops->set_map(vdpa, dev->iotlb); - else + else if (ops->set_map) { + if (!v->in_batch) + ops->set_map(vdpa, dev->iotlb); + } else iommu_unmap(v->domain, iova, size); } @@ -655,6 +661,8 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, struct vhost_iotlb_msg *msg) { struct vhost_vdpa *v = container_of(dev, struct vhost_vdpa, vdev); + struct vdpa_device *vdpa = v->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; int r = 0; r = vhost_dev_check_owner(dev); @@ -668,6 +676,14 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev *dev, case VHOST_IOTLB_INVALIDATE: vhost_vdpa_unmap(v, msg->iova, msg->size); break; + case VHOST_IOTLB_BATCH_BEGIN: + v->in_batch = true; + break; + case VHOST_IOTLB_BATCH_END: + if (v->in_batch && ops->set_map) + ops->set_map(vdpa, dev->iotlb); + v->in_batch = false; + break; default: r = -EINVAL; break; diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h index 0c2349612e77..565da96f55d5 100644 --- a/include/uapi/linux/vhost.h +++ b/include/uapi/linux/vhost.h @@ -91,6 +91,8 @@ /* Use message type V2 */ #define VHOST_BACKEND_F_IOTLB_MSG_V2 0x1 +/* IOTLB can accpet batching hints */typoWill fix.+#define VHOST_BACKEND_F_IOTLB_BATCH 0x2 #define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64) #define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64) diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h index 669457ce5c48..5c12faffdde9 100644 --- a/include/uapi/linux/vhost_types.h +++ b/include/uapi/linux/vhost_types.h @@ -60,6 +60,13 @@ struct vhost_iotlb_msg { #define VHOST_IOTLB_UPDATE 2 #define VHOST_IOTLB_INVALIDATE 3 #define VHOST_IOTLB_ACCESS_FAIL 4 +/* VHOST_IOTLB_BATCH_BEGIN is a hint that userspace will update + * several mappings afterwards. VHOST_IOTLB_BATCH_END is a hint that + * userspace had finished the mapping updating.Well not just hints - in fact updates do not take place until _END. How about: /* VHOST_IOTLB_BATCH_BEGIN and VHOST_IOTLB_BATCH_END allow modifying * multiple mappings in one go: beginning with * VHOST_IOTLB_BATCH_BEGIN, followed by any number of VHOST_IOTLB_UPDATE messages, and ending with VHOST_IOTLB_BATCH_END. */That's better.Is there a guarantee that these changes take place atomically? Let's document that.
There's no guarantee. Will document this. Thanks
When those two flags + * were set, kernel will ignore the rest fileds of the IOTLB message.how about: when one of these two values is used as the message type, the rest of the fields in the message are ignored.Yes. Will fix. Thanks+ */ +#define VHOST_IOTLB_BATCH_BEGIN 5 +#define VHOST_IOTLB_BATCH_END 6 __u8 type; }; -- 2.20.1
_______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization