Re: [PATCH vhost v9 05/12] virtio_ring: split: virtqueue_add_split() support premapped

On 2023/5/18 17:49, Michael S. Tsirkin wrote:
On Thu, May 18, 2023 at 05:14:03PM +0800, Xuan Zhuo wrote:
On Thu, 18 May 2023 16:57:37 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:
On Thu, May 18, 2023 at 4:29 PM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
On Thu, May 18, 2023 at 03:33:52PM +0800, Xuan Zhuo wrote:
On Thu, 18 May 2023 03:11:25 -0400, "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
On Thu, May 18, 2023 at 02:51:57PM +0800, Jason Wang wrote:
On Wed, May 17, 2023 at 10:23 AM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
virtqueue_add_split() currently only supports virtual addresses; the DMA
mapping is done inside virtqueue_add_split() itself.

In some scenarios (such as the AF_XDP scenario), the memory is allocated
and DMA is completed in advance, so it is necessary for us to support
passing the DMA address to virtqueue_add_split().

Record this information in desc_state, so that we can skip the unmap for
such descriptors when detaching the buffer.
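The record-and-skip mechanism described above can be sketched as a tiny standalone model (plain C with invented names and a stub in place of dma_unmap_page(); this is not the actual virtio_ring code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for vring_desc_state_split: the token plus a
 * per-descriptor flag saying the driver did the DMA mapping itself. */
struct desc_state {
	void *data;
	bool premapped;
};

static int unmap_calls;	/* counts how often we "unmap" */

/* Stand-in for dma_unmap_page(): only counts invocations here. */
static void fake_dma_unmap(void)
{
	unmap_calls++;
}

/* On detach, the unmap is skipped when the driver owns the mapping;
 * this is the core of what the patch records in desc_state. */
static void detach_buf(struct desc_state *state)
{
	if (!state->premapped)
		fake_dma_unmap();
	state->data = NULL;
}
```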
I would also suggest documenting why per-descriptor metadata is
needed instead of per-virtqueue metadata.
I think we could make it per virtqueue. That would mean all code in
virtio net would have to change to do dma mapping itself instead of
relying on virtio core though.  Which is maybe a good idea? Definitely a
very intrusive change though, will need a lot of performance testing
to make sure we don't break anything.
In fact, we have tried this idea.

The problem is the detach and unmap.

We need to get all the DMA addresses from the virtio ring in order to unmap.
Currently there is no way to return the DMA addresses, and for an SKB we need
to get multiple DMA addresses at one time.

This would require modifying the virtio-ring detach logic. Apart from that,
I also agree with this idea.

Thanks.
Well you can have a version of get_buf that returns them ... but
it is not clear to me all this is worth it unless you want
to do unsafe tricks like leaving them mapped.
Some high speed NIC drivers use this trick for better performance.

Interesting, this is the first time I have heard of this. Is there any problem with it?
It depends. If you are relying on the IOMMU then yes: malicious hardware
can steal guest secrets or corrupt memory, since it is a hack not properly
integrated with Linux, and there is no real control preventing Linux from
reusing this memory for something unrelated.


The pages are pre-allocated/mapped buffers for RX. So it should be fine.

Thanks


If instead you are using something like bounce buffers then no, but OTOH
bounce buffers are already expensive so you might not see a lot
of benefit.

So, is having virtio-net manage the DMA operations by itself the right way?

Thanks
I am fine with the approach taken for now; we can look at reducing the
cost of DMA map/unmap later.


I'd leave that
for another day maybe.

For marking a desc as premapped I think we can use a bit from
desc_extra->flags, either reusing one of NEXT, AVAIL or USED, or stealing
another one.
Probably.

Thanks
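The flag-bit suggestion above could look roughly like the following sketch. VRING_DESC_F_PREMAPPED is a hypothetical name, not part of the virtio spec or of this patch; the existing bit values are the split-ring descriptor flags from the spec:

```c
#include <stdint.h>

/* Split-ring descriptor flag bits defined by the virtio spec. */
#define VRING_DESC_F_NEXT	1
#define VRING_DESC_F_WRITE	2
#define VRING_DESC_F_INDIRECT	4

/* Hypothetical stolen bit meaning "mapping owned by the driver";
 * purely illustrative. */
#define VRING_DESC_F_PREMAPPED	(1u << 15)

/* Simplified model of struct vring_desc_extra. */
struct desc_extra {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

static void mark_premapped(struct desc_extra *extra)
{
	extra->flags |= VRING_DESC_F_PREMAPPED;
}

static int is_premapped(const struct desc_extra *extra)
{
	return !!(extra->flags & VRING_DESC_F_PREMAPPED);
}
```

Since desc_extra->flags is software-only state (it is never read by the device), stealing a high bit like this avoids growing desc_state and keeps the information on a cache line that detach already touches.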






Signed-off-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>
---
  drivers/virtio/virtio_ring.c | 38 +++++++++++++++++++++++++++---------
  1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index e2fc50c05bec..bd5e84afab37 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -70,6 +70,7 @@
  struct vring_desc_state_split {
         void *data;                     /* Data for callback. */
         struct vring_desc *indir_desc;  /* Indirect descriptor, if any. */
+       bool premapped;                 /* DMA mapping is done by driver. */
Going back to the original discussion around where this should be
placed: I wonder if we can find a common place to store this, since it is
not related to the virtqueue layout. Maybe desc_extra? And it would be
even better if we could avoid stressing the cache like the above does.

  };

  struct vring_desc_state_packed {
@@ -356,8 +357,14 @@ static struct device *vring_dma_dev(const struct vring_virtqueue *vq)

  /* Map one sg entry. */
  static int vring_map_one_sg(const struct vring_virtqueue *vq, struct scatterlist *sg,
-                           enum dma_data_direction direction, dma_addr_t *addr)
+                           enum dma_data_direction direction,
+                           bool premapped, dma_addr_t *addr)
having things like:

void func(bool flag)
{
        if (!flag)
                return;
        ...
}

is a hint that the check needs to be done by the caller?
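A compilable version of that sketch, with invented names, contrasting the callee-side check used by the patch with the caller-side alternative being suggested:

```c
#include <stdbool.h>

static int mapped;	/* counts real mapping work */

/* Callee-side check, as in the patch: every call branches inside. */
static void map_one_checked(bool premapped)
{
	if (premapped)
		return;
	mapped++;
}

/* Caller-side check: the helper does only the real work... */
static void map_one(void)
{
	mapped++;
}

/* ...and the decision is hoisted to the call site. */
static void add_buf(bool premapped)
{
	if (!premapped)
		map_one();
}
```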

And this change should work for both packed and split. I think we need
to squash the packed changes here.

Looking at how packed virtqueue uses this in this patch, I don't think
this patch can even be built. I will wait for a new version and
continue the review from there.

Thanks



  {
+       if (premapped) {
+               *addr = sg_dma_address(sg);
+               return 0;
+       }
+
         if (!vq->use_dma_api) {
                 /*
                  * If DMA is not used, KMSAN doesn't know that the scatterlist
@@ -445,7 +452,7 @@ static void vring_unmap_one_split_indirect(const struct vring_virtqueue *vq,
  }

  static unsigned int vring_unmap_one_split(const struct vring_virtqueue *vq,
-                                         unsigned int i)
+                                         unsigned int i, bool premapped)
  {
         struct vring_desc_extra *extra = vq->split.desc_extra;
         u16 flags;
@@ -462,6 +469,9 @@ static unsigned int vring_unmap_one_split(const struct vring_virtqueue *vq,
                                  (flags & VRING_DESC_F_WRITE) ?
                                  DMA_FROM_DEVICE : DMA_TO_DEVICE);
         } else {
+               if (premapped)
+                       goto out;
+
                 dma_unmap_page(vring_dma_dev(vq),
                                extra[i].addr,
                                extra[i].len,
@@ -532,6 +542,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
                                       unsigned int in_sgs,
                                       void *data,
                                       void *ctx,
+                                     bool premapped,
                                       gfp_t gfp)
  {
         struct vring_virtqueue *vq = to_vvq(_vq);
@@ -595,7 +606,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
                 for (sg = sgs[n]; sg; sg = sg_next(sg)) {
                         dma_addr_t addr;

-                       if (vring_map_one_sg(vq, sg, DMA_TO_DEVICE, &addr))
+                       if (vring_map_one_sg(vq, sg, DMA_TO_DEVICE, premapped, &addr))
                                 goto unmap_release;

                         prev = i;
@@ -611,7 +622,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
                 for (sg = sgs[n]; sg; sg = sg_next(sg)) {
                         dma_addr_t addr;

-                       if (vring_map_one_sg(vq, sg, DMA_FROM_DEVICE, &addr))
+                       if (vring_map_one_sg(vq, sg, DMA_FROM_DEVICE, premapped, &addr))
                                 goto unmap_release;

                         prev = i;
@@ -657,6 +668,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,

         /* Store token and indirect buffer state. */
         vq->split.desc_state[head].data = data;
+       vq->split.desc_state[head].premapped = premapped;
         if (indirect)
                 vq->split.desc_state[head].indir_desc = desc;
         else
@@ -686,6 +698,14 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
         return 0;

  unmap_release:
+       if (premapped) {
+               if (indirect)
+                       kfree(desc);
+
+               END_USE(vq);
+               return -ENOMEM;
+       }
+
         err_idx = i;

         if (indirect)
@@ -700,7 +720,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
                         vring_unmap_one_split_indirect(vq, &desc[i]);
                         i = virtio16_to_cpu(_vq->vdev, desc[i].next);
                 } else
-                       i = vring_unmap_one_split(vq, i);
+                       i = vring_unmap_one_split(vq, i, false);
         }

         if (indirect)
@@ -757,12 +777,12 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
         i = head;

         while (vq->split.vring.desc[i].flags & nextflag) {
-               vring_unmap_one_split(vq, i);
+               vring_unmap_one_split(vq, i, state->premapped);
                 i = vq->split.desc_extra[i].next;
                 vq->vq.num_free++;
         }

-       vring_unmap_one_split(vq, i);
+       vring_unmap_one_split(vq, i, state->premapped);
         vq->split.desc_extra[i].next = vq->free_head;
         vq->free_head = head;

@@ -783,7 +803,7 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
                                 VRING_DESC_F_INDIRECT));
                 BUG_ON(len == 0 || len % sizeof(struct vring_desc));

-               if (vq->use_dma_api) {
+               if (vq->use_dma_api && !state->premapped) {
                         for (j = 0; j < len / sizeof(struct vring_desc); j++)
                                 vring_unmap_one_split_indirect(vq, &indir_desc[j]);
                 }
@@ -2143,7 +2163,7 @@ static inline int virtqueue_add(struct virtqueue *_vq,
         return vq->packed_ring ? virtqueue_add_packed(_vq, sgs, total_sg,
                                         out_sgs, in_sgs, data, ctx, gfp) :
                                  virtqueue_add_split(_vq, sgs, total_sg,
-                                       out_sgs, in_sgs, data, ctx, gfp);
+                                       out_sgs, in_sgs, data, ctx, premapped, gfp);
  }

  /**
--
2.32.0.3.g01195cf9f


_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization



