Re: [PATCH v2 4/9] virtio_net: build xdp_buff with multi buffers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





在 2022/12/28 下午2:27, Jason Wang 写道:

在 2022/12/27 17:10, Heng Qi 写道:


在 2022/12/27 下午2:46, Jason Wang 写道:

在 2022/12/20 22:14, Heng Qi 写道:
Support xdp for multi buffer packets in mergeable mode.

Putting the first buffer as the linear part for xdp_buff,
and the rest of the buffers as non-linear fragments to struct
skb_shared_info in the tailroom belonging to xdp_buff.

Signed-off-by: Heng Qi <hengqi@xxxxxxxxxxxxxxxxx>
Reviewed-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx>
---
  drivers/net/virtio_net.c | 78 ++++++++++++++++++++++++++++++++++++++++
  1 file changed, 78 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 08f209d7b0bf..8fc3b1841d92 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -931,6 +931,84 @@ static struct sk_buff *receive_big(struct net_device *dev,
      return NULL;
  }
  +/* TODO: build xdp in big mode */
+static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
+                      struct virtnet_info *vi,
+                      struct receive_queue *rq,
+                      struct xdp_buff *xdp,
+                      void *buf,
+                      unsigned int len,
+                      unsigned int frame_sz,
+                      u16 *num_buf,
+                      unsigned int *xdp_frags_truesize,
+                      struct virtnet_rq_stats *stats)
+{
+    unsigned int tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+    struct virtio_net_hdr_mrg_rxbuf *hdr = buf;
+    unsigned int truesize, headroom;
+    struct skb_shared_info *shinfo;
+    unsigned int xdp_frags_truesz = 0;
+    unsigned int cur_frag_size;
+    struct page *page;
+    skb_frag_t *frag;
+    int offset;
+    void *ctx;
+
+    xdp_init_buff(xdp, frame_sz, &rq->xdp_rxq);
+    xdp_prepare_buff(xdp, buf - VIRTIO_XDP_HEADROOM,
+             VIRTIO_XDP_HEADROOM + vi->hdr_len, len - vi->hdr_len, true);
+
+    if (*num_buf > 1) {
+        shinfo = xdp_get_shared_info_from_buff(xdp);
+        shinfo->nr_frags = 0;
+        shinfo->xdp_frags_size = 0;
+    }
+
+    if ((*num_buf - 1) > MAX_SKB_FRAGS)
+        return -EINVAL;
+
+    while ((--*num_buf) >= 1) {
+        buf = virtqueue_get_buf_ctx(rq->vq, &len, &ctx);
+        if (unlikely(!buf)) {
+            pr_debug("%s: rx error: %d buffers out of %d missing\n",
+                 dev->name, *num_buf,
+                 virtio16_to_cpu(vi->vdev, hdr->num_buffers));
+            dev->stats.rx_length_errors++;
+            return -EINVAL;
+        }
+
+        if (!xdp_buff_has_frags(xdp))
+            xdp_buff_set_frags_flag(xdp);


Any reason to put this inside the loop?

I'll move it outside the loop in the next version.



+
+        stats->bytes += len;
+        page = virt_to_head_page(buf);
+        offset = buf - page_address(page);
+        truesize = mergeable_ctx_to_truesize(ctx);
+        headroom = mergeable_ctx_to_headroom(ctx);
+
+        cur_frag_size = truesize + (headroom ? (headroom + tailroom) : 0);
+        xdp_frags_truesz += cur_frag_size;


Not related to this patch, but it would easily confuse the future readers that the we need another math for truesize. I think at least we need some comments for this or

Yes it might, I'll add more comments on this.


I guess the root cause is in get_mergeable_buf_len:

static unsigned int get_mergeable_buf_len(struct receive_queue *rq,
                                      struct ewma_pkt_len *avg_pkt_len,
                                          unsigned int room)
{
        struct virtnet_info *vi = rq->vq->vdev->priv;
        const size_t hdr_len = vi->hdr_len;
        unsigned int len;

        if (room)
        return PAGE_SIZE - room;

And we do

    len = get_mergeable_buf_len(rq, &rq->mrg_avg_pkt_len, room);

    ...

    ctx = mergeable_len_to_ctx(len, headroom);


I wonder if it's better to pack the real truesize (PAGE_SIZE) here. This may ease a lot of things.

I don't know the historical reason for not packing, but I guess this is for the convenience of comparing the actual length len given by the device with truesize. Therefore, I think it would
be better to keep the above practice for now?


The problem is:

We had

static unsigned int mergeable_ctx_to_truesize(void *mrg_ctx)

It means the truesize should be calculated before packing into the context. So having another math to get truesize is very confusing, I think it's better to fix that. Otherwise we may end up code that is hard to be reviewed even with the help of comments.

It is reasonable to let truesize return to its literal meaning, which includes the length of
headroom and tailroom, and I will modify this in the next version.

Thanks.


Thanks


Perhaps, I can add more explanation to the
above code to help future readers try not to get confused.

Thanks.


Thanks


+        if (unlikely(len > truesize || cur_frag_size > PAGE_SIZE)) {
+            pr_debug("%s: rx error: len %u exceeds truesize %lu\n",
+                 dev->name, len, (unsigned long)ctx);
+            dev->stats.rx_length_errors++;
+            return -EINVAL;
+        }
+
+        frag = &shinfo->frags[shinfo->nr_frags++];
+        __skb_frag_set_page(frag, page);
+        skb_frag_off_set(frag, offset);
+        skb_frag_size_set(frag, len);
+        if (page_is_pfmemalloc(page))
+            xdp_buff_set_frag_pfmemalloc(xdp);
+
+        shinfo->xdp_frags_size += len;
+    }
+
+    *xdp_frags_truesize = xdp_frags_truesz;
+    return 0;
+}
+
  static struct sk_buff *receive_mergeable(struct net_device *dev,
                       struct virtnet_info *vi,
                       struct receive_queue *rq,





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux