Re: [PATCH vhost v2 06/12] virtio_ring: split-indirect: support premapped

On Thu, 16 Mar 2023 10:49:54 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> On Wed, Mar 15, 2023 at 2:06 PM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, 15 Mar 2023 12:47:29 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > > On Tue, Mar 14, 2023 at 5:17 PM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On Tue, 14 Mar 2023 15:57:06 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > > > > On Wed, Mar 8, 2023 at 2:44 PM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > virtio core only supports virtual addresses, dma is completed in virtio
> > > > > > core.
> > > > > >
> > > > > > In some scenarios (such as the AF_XDP), the memory is allocated
> > > > > > and DMA is completed in advance, so it is necessary for us to support
> > > > > > passing the DMA address to virtio core.
> > > > > >
> > > > > > > Drivers can use sg->dma_address to pass the mapped DMA address to virtio
> > > > > > > core. If one sg uses sg->dma_address, then all sgs must use it;
> > > > > > > otherwise all dma_address fields must be null.
> > > > > >
> > > > > > > On the indirect path, if dma_address is used, desc_state.indir_desc is
> > > > > > > tagged with VRING_INDIRECT_PREMAPPED, so the unmap step can skip it.
> > > > >
> > > > > > It's better to mention why indirect descriptors can't be handled in the
> > > > > > same way as direct descriptors.
> > > > >
> > > > > Btw, if we change the semantics of desc_extra.dma_addr and
> > > > > desc_state.indir_desc, we should add comments to definitions of those
> > > > > structures.
> > > >
> > > >
> > > > Will fix.
> > > >
> > > > >
> > > > > >
> > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>
> > > > > > ---
> > > > > >  drivers/virtio/virtio_ring.c | 28 ++++++++++++++++++++++------
> > > > > >  1 file changed, 22 insertions(+), 6 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > index 66a071e3bdef..11827d2e56a8 100644
> > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > @@ -231,6 +231,18 @@ static void vring_free(struct virtqueue *_vq);
> > > > > >   * Helpers.
> > > > > >   */
> > > > > >
> > > > > > +#define VRING_INDIRECT_PREMAPPED  BIT(0)
> > > > > > +
> > > > > > +#define desc_mix_dma_map(do_map, desc) \
> > > > > > +       (do_map ? desc : (typeof(desc))((unsigned long)(desc) | VRING_INDIRECT_PREMAPPED))
> > > > > > +
> > > > > > +#define desc_rm_dma_map(desc) \
> > > > > > +       ((typeof(desc))((unsigned long)(desc) & ~VRING_INDIRECT_PREMAPPED))
> > > > > > +
> > > > > > +#define desc_map_inter(desc) \
> > > > > > +       !((unsigned long)(desc) & VRING_INDIRECT_PREMAPPED)
> > > > > > +
> > > > > > +
> > > > > >  #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
> > > > > >
> > > > > >  static inline bool virtqueue_use_indirect(struct vring_virtqueue *vq,
> > > > > > @@ -725,7 +737,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
> > > > > >         /* Store token and indirect buffer state. */
> > > > > >         vq->split.desc_state[head].data = data;
> > > > > >         if (indirect)
> > > > > > -               vq->split.desc_state[head].indir_desc = desc;
> > > > > > +               vq->split.desc_state[head].indir_desc = desc_mix_dma_map(do_map, desc);
> > > > >
> > > > > So using indir_desc is kind of hacky (since we don't use indirect for
> > > > > rx with extra context).
> > > > >
> > > > > > But at least I think we should seek a way to use the same metadata for
> > > > > both direct and indirect descriptors.
> > > > >
> > > > > > E.g., can we make them all use indir_desc?
> > > >
> > > > > I don't think we can. My original idea was to use indir_desc uniformly, but
> > > > > in the case where it stores ctx, we cannot guarantee that ctx has a spare bit for us.
> > >
> > > > Ok, but the problem is that the code becomes even more hacky (imagine
> > > > that one day we want to use indirect for RX?).
> >
> >
> > I think it may have nothing to do with RX, but whether ctx is used. Because ctx
> > and indirect cannot coexist.
> >
> >         static inline int virtqueue_add_split(struct virtqueue *_vq,
> >                                               struct scatterlist *sgs[],
> >                                               unsigned int total_sg,
> >                                               unsigned int out_sgs,
> >                                               unsigned int in_sgs,
> >                                               void *data,
> >                                               void *ctx,
> >                                               gfp_t gfp)
> >         {
> >                 struct vring_virtqueue *vq = to_vvq(_vq);
> >                 struct scatterlist *sg;
> >                 struct vring_desc *desc;
> >                 unsigned int i, n, avail, descs_used, prev, err_idx;
> >                 int head;
> >                 bool indirect;
> >
> >                 START_USE(vq);
> >
> >                 BUG_ON(data == NULL);
> >         >       BUG_ON(ctx && vq->indirect);
> >
> > If we want to use ctx with indirect, we must add dedicated metadata for ctx.
>
> The reason for this BUG_ON() is that we do a hack:
>
> A vq that uses ctx can't use indirect. The only user so far is
> the mergeable RX path. This patch adds one more hack on top, which seems
> a burden for future maintenance. Consider that one day we may want to
> use a virtqueue with both indirect and extra context.

OK. I see.
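
For reference, the low-bit tagging that the patch applies to indir_desc
can be sketched in userspace roughly as below. This is only an
illustration, not the kernel code: struct desc, PREMAPPED_BIT,
tag_desc(), untag_desc(), desc_is_premapped() and roundtrip_ok() are
hypothetical names, and it assumes the stored pointer always has bit 0
clear, which holds for an allocated table but not for an arbitrary ctx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct vring_desc. */
struct desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

/* Bit 0 of the stored pointer marks "driver already did the DMA mapping". */
#define PREMAPPED_BIT ((uintptr_t)1)

/* Fold the premapped flag into the pointer that gets stored. */
static struct desc *tag_desc(struct desc *d, bool premapped)
{
	return (struct desc *)((uintptr_t)d | (premapped ? PREMAPPED_BIT : 0));
}

/* Recover the real pointer; must be done before dereferencing or freeing. */
static struct desc *untag_desc(struct desc *d)
{
	return (struct desc *)((uintptr_t)d & ~PREMAPPED_BIT);
}

/* True if the stored pointer carries the premapped flag. */
static bool desc_is_premapped(struct desc *d)
{
	return ((uintptr_t)d & PREMAPPED_BIT) != 0;
}

/* Store a tagged table pointer, read it back, and free the table. */
static int roundtrip_ok(void)
{
	struct desc *table = calloc(4, sizeof(*table));
	struct desc *stored;
	int ok;

	if (!table)
		return 0;

	/* The trick only works if bit 0 of the real pointer is free. */
	assert(((uintptr_t)table & PREMAPPED_BIT) == 0);

	stored = tag_desc(table, true);
	ok = desc_is_premapped(stored) && untag_desc(stored) == table;

	/* Strip the tag before handing the pointer back to the allocator. */
	free(untag_desc(stored));
	return ok;
}
```

The weak point is the assumption in the first assert: nothing enforces
it for a caller-supplied ctx, which is why the same slot cannot carry
the flag in the ctx case.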

I will add dedicated metadata if there is no other advice.
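
A rough userspace sketch of what "dedicated metadata" could look like,
only to make the idea concrete. The struct layout and the names
struct state, premapped, state_store() and state_needs_unmap() are
assumptions for illustration, not the actual next version of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-head state with a separate flag instead of a tagged pointer. */
struct state {
	void *data;       /* driver token */
	void *indir_desc; /* indirect table or ctx, stored unmodified */
	bool premapped;   /* assumed dedicated field: driver did the DMA mapping */
};

/* Record one buffer; the pointer and the flag never share storage. */
static void state_store(struct state *s, void *data, void *indir_or_ctx,
			bool premapped)
{
	s->data = data;
	s->indir_desc = indir_or_ctx;
	s->premapped = premapped;
}

/* The core unmaps on detach only when it did the mapping itself. */
static bool state_needs_unmap(const struct state *s)
{
	return !s->premapped;
}
```

With a separate field, a ctx value that happens to have bit 0 set is
stored and returned unchanged, and the same check works for direct and
indirect descriptors. The cost is extra per-descriptor state.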

Thanks.

>
> We can hear from others.
>
> Thanks
>
> >
> > So I think the current plan is OK.
> >
> > Thanks.
> >
> > >
> > > So I'm inclined to change my mind and introduce dedicated metadata, instead
> > > of trying to pack this into the two existing types.
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > >         else
> > > > > >                 vq->split.desc_state[head].indir_desc = ctx;
> > > > > >
> > > > > > @@ -820,22 +832,26 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
> > > > > >         vq->vq.num_free++;
> > > > > >
> > > > > >         if (vq->indirect) {
> > > > > > -               struct vring_desc *indir_desc =
> > > > > > -                               vq->split.desc_state[head].indir_desc;
> > > > > > +               struct vring_desc *mix = vq->split.desc_state[head].indir_desc;
> > > > > > +               struct vring_desc *indir_desc;
> > > > > >                 u32 len;
> > > > > >
> > > > > >                 /* Free the indirect table, if any, now that it's unmapped. */
> > > > > > -               if (!indir_desc)
> > > > > > +               if (!mix)
> > > > > >                         return;
> > > > > >
> > > > > > +               indir_desc = desc_rm_dma_map(mix);
> > > > > > +
> > > > > >                 len = vq->split.desc_extra[head].len;
> > > > > >
> > > > > >                 BUG_ON(!(vq->split.desc_extra[head].flags &
> > > > > >                                 VRING_DESC_F_INDIRECT));
> > > > > >                 BUG_ON(len == 0 || len % sizeof(struct vring_desc));
> > > > > >
> > > > > > -               for (j = 0; j < len / sizeof(struct vring_desc); j++)
> > > > > > -                       vring_unmap_one_split_indirect(vq, &indir_desc[j]);
> > > > > > +               if (desc_map_inter(mix)) {
> > > > > > +                       for (j = 0; j < len / sizeof(struct vring_desc); j++)
> > > > > > +                               vring_unmap_one_split_indirect(vq, &indir_desc[j]);
> > > > > > +               }
> > > > > >
> > > > > >                 kfree(indir_desc);
> > > > > >                 vq->split.desc_state[head].indir_desc = NULL;
> > > > > > --
> > > > > > 2.32.0.3.g01195cf9f
> > > > > >
> > > > >
> > > >
> > >
> >
>