Re: [PATCH vhost v2 06/12] virtio_ring: split-indirect: support premapped

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 15, 2023 at 2:06 PM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, 15 Mar 2023 12:47:29 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > On Tue, Mar 14, 2023 at 5:17 PM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Tue, 14 Mar 2023 15:57:06 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote:
> > > > On Wed, Mar 8, 2023 at 2:44 PM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > virtio core only supports virtual addresses, dma is completed in virtio
> > > > > core.
> > > > >
> > > > > In some scenarios (such as the AF_XDP), the memory is allocated
> > > > > and DMA is completed in advance, so it is necessary for us to support
> > > > > passing the DMA address to virtio core.
> > > > >
> > > > > Drives can use sg->dma_address to pass the mapped dma address to virtio
> > > > > core. If one sg->dma_address is used then all sgs must use sg->dma_address,
> > > > > otherwise all dma_address must be null.
> > > > >
> > > > > On the indirect path, if dma_address is used, desc_state.indir_desc will
> > > > > be mixed with VRING_INDIRECT_PREMAPPED. So when do unmap, we can pass it.
> > > >
> > > > It's better to mention why indirect descriptors can't be done in the
> > > > same way with direct descriptors.
> > > >
> > > > Btw, if we change the semantics of desc_extra.dma_addr and
> > > > desc_state.indir_desc, we should add comments to definitions of those
> > > > structures.
> > >
> > >
> > > Will fix.
> > >
> > > >
> > > > >
> > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>
> > > > > ---
> > > > >  drivers/virtio/virtio_ring.c | 28 ++++++++++++++++++++++------
> > > > >  1 file changed, 22 insertions(+), 6 deletions(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index 66a071e3bdef..11827d2e56a8 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -231,6 +231,18 @@ static void vring_free(struct virtqueue *_vq);
> > > > >   * Helpers.
> > > > >   */
> > > > >
> > > > > +#define VRING_INDIRECT_PREMAPPED  BIT(0)
> > > > > +
> > > > > +#define desc_mix_dma_map(do_map, desc) \
> > > > > +       (do_map ? desc : (typeof(desc))((unsigned long)(desc) | VRING_INDIRECT_PREMAPPED))
> > > > > +
> > > > > +#define desc_rm_dma_map(desc) \
> > > > > +       ((typeof(desc))((unsigned long)(desc) & ~VRING_INDIRECT_PREMAPPED))
> > > > > +
> > > > > +#define desc_map_inter(desc) \
> > > > > +       !((unsigned long)(desc) & VRING_INDIRECT_PREMAPPED)
> > > > > +
> > > > > +
> > > > >  #define to_vvq(_vq) container_of(_vq, struct vring_virtqueue, vq)
> > > > >
> > > > >  static inline bool virtqueue_use_indirect(struct vring_virtqueue *vq,
> > > > > @@ -725,7 +737,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq,
> > > > >         /* Store token and indirect buffer state. */
> > > > >         vq->split.desc_state[head].data = data;
> > > > >         if (indirect)
> > > > > -               vq->split.desc_state[head].indir_desc = desc;
> > > > > +               vq->split.desc_state[head].indir_desc = desc_mix_dma_map(do_map, desc);
> > > >
> > > > So using indir_desc is kind of hacky (since we don't use indirect for
> > > > rx with extra context).
> > > >
> > > > But at least I think we should seeka way to use the same metadata for
> > > > both direct and indirect descriptors.
> > > >
> > > > E.g can we make them all to use indir_desc?
> > >
> > > I think it may not. My original idea is to use indir_desc uniformly, but
> > > for the scene of saving ctx, we cannot guarantee that the ctx has space for us.
> >
> > Ok, but the problem is that the code became even more hacky (imagine
> > one day we may want to use indirect for RX?).
>
>
> I think it may have nothing to do with RX, but whether ctx is used. Because ctx
> and indirect cannot coexist.
>
>         static inline int virtqueue_add_split(struct virtqueue *_vq,
>                                               struct scatterlist *sgs[],
>                                               unsigned int total_sg,
>                                               unsigned int out_sgs,
>                                               unsigned int in_sgs,
>                                               void *data,
>                                               void *ctx,
>                                               gfp_t gfp)
>         {
>                 struct vring_virtqueue *vq = to_vvq(_vq);
>                 struct scatterlist *sg;
>                 struct vring_desc *desc;
>                 unsigned int i, n, avail, descs_used, prev, err_idx;
>                 int head;
>                 bool indirect;
>
>                 START_USE(vq);
>
>                 BUG_ON(data == NULL);
>         >       BUG_ON(ctx && vq->indirect);
>
> If we want to use ctx with indirect, we must add an dedicated metadata for ctx.

The reason this BUG_ON() is that we do a hack:

For the vq that use ctx it can't use indirect. The only user so far is
the mergeable RX path. This path adds one more hack on top, this seems
a burden for the future maintenance. Consider one day we may want to
use a virtqueue with both indirect and extra context.

We can hear from others.

Thanks

>
> So I think the current plan is OK.
>
> Thanks.
>
> >
> > So I tend to change my mind to introduce dedicated metadata, instead
> > of trying to be packed with two types of the existing ones.
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > > >
> > > > Thanks
> > > >
> > > > >         else
> > > > >                 vq->split.desc_state[head].indir_desc = ctx;
> > > > >
> > > > > @@ -820,22 +832,26 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
> > > > >         vq->vq.num_free++;
> > > > >
> > > > >         if (vq->indirect) {
> > > > > -               struct vring_desc *indir_desc =
> > > > > -                               vq->split.desc_state[head].indir_desc;
> > > > > +               struct vring_desc *mix = vq->split.desc_state[head].indir_desc;
> > > > > +               struct vring_desc *indir_desc;
> > > > >                 u32 len;
> > > > >
> > > > >                 /* Free the indirect table, if any, now that it's unmapped. */
> > > > > -               if (!indir_desc)
> > > > > +               if (!mix)
> > > > >                         return;
> > > > >
> > > > > +               indir_desc = desc_rm_dma_map(mix);
> > > > > +
> > > > >                 len = vq->split.desc_extra[head].len;
> > > > >
> > > > >                 BUG_ON(!(vq->split.desc_extra[head].flags &
> > > > >                                 VRING_DESC_F_INDIRECT));
> > > > >                 BUG_ON(len == 0 || len % sizeof(struct vring_desc));
> > > > >
> > > > > -               for (j = 0; j < len / sizeof(struct vring_desc); j++)
> > > > > -                       vring_unmap_one_split_indirect(vq, &indir_desc[j]);
> > > > > +               if (desc_map_inter(mix)) {
> > > > > +                       for (j = 0; j < len / sizeof(struct vring_desc); j++)
> > > > > +                               vring_unmap_one_split_indirect(vq, &indir_desc[j]);
> > > > > +               }
> > > > >
> > > > >                 kfree(indir_desc);
> > > > >                 vq->split.desc_state[head].indir_desc = NULL;
> > > > > --
> > > > > 2.32.0.3.g01195cf9f
> > > > >
> > > >
> > >
> >
>

_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization




[Index of Archives]     [KVM Development]     [Libvirt Development]     [Libvirt Users]     [CentOS Virtualization]     [Netdev]     [Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux