On Tue, 30 Jan 2024 11:15:07 +0800, Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote: > On Tue, 30 Jan 2024 10:56:25 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote: > > On Mon, Jan 29, 2024 at 11:28 AM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote: > > > > > > On Mon, 29 Jan 2024 11:06:35 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote: > > > > On Thu, Jan 25, 2024 at 2:24 PM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote: > > > > > > > > > > On Thu, 25 Jan 2024 11:39:20 +0800, Jason Wang <jasowang@xxxxxxxxxx> wrote: > > > > > > On Tue, Jan 16, 2024 at 3:59 PM Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> wrote: > > > > > > > > > > > > > > If the xsk is enabling, the xsk tx will share the send queue. > > > > > > > > > > > > Any reason for this? Technically, virtio-net can work as other NIC > > > > > > like 256 queues. There could be some work like optimizing the > > > > > > interrupt allocations etc. > > > > > > > > > > Just like the logic of XDP_TX. > > > > > > > > > > Now the virtio spec does not allow to add new dynamic queues. > > > > > As I know, most hypervisors just support few queues. > > > > > > > > When multiqueue is developed in Qemu, it support as least 256 queue > > > > pairs if my memory is correct. > > > > > > > > > > > > > YES, but that is configured by the hypervisor. > > > > > > For the user on any platform, when he got a vm, the queue num is fixed. > > > As I know, on most case, the num is less. > > > If we want the af-xdp/xdp-tx has the the independent queues > > > I think the dynamic queue is good way. > > > > Yes, we can start from this. > > > My plan is start from sharing send queues. > > After that I will push the dynamic queues rfc to the virtio spec. > > If the new feature is negotiated, then we can support xdp/af-xdp > with independent send queues, if the feature is not supported, > xdp/af-xdp can work with sharing send queue. > > I think that will not conflict. cc Parav. > > > > > > > > > > > > > > > The num of > > > > > queues is not bigger than the cpu num. So the best way is > > > > > to share the send queues. > > > > > > > > > > Parav and I tried to introduce dynamic queues. > > > > > > > > Virtio-net doesn't differ from real NIC where most of them can create > > > > queue dynamically. It's more about the resource allocation, if mgmt > > > > can start with 256 queues, then we probably fine. > > > > > > But now, if the devices has 256, we will enable the 256 queues by default. > > > that is too much. > > > > It doesn't differ from the other NIC. E.g currently the active #qps is > > determined by the number of cpus. this is only true if we have 256 > > cpus. > > > YES. But now, the normal devices just have few queues (such as 8, 32). > > Thanks. > > > > > > > > > > So, the dynamic queue is not to create a new queue out of the resource. > > > > > > The device may tell the driver, the max queue resource is 256, > > > but let we start from 8. If the driver need more, then we can > > > enable more. > > > > This is the policy we used now. > > > > > > > > But for me, the xdp tx can share the sq queue, so let we start > > > the af-xdp from sharing sq queue. > > > > > > > > > > > > > > But I think we can leave this question now. > > > > > > > > > But that is dropped. > > > > > Before that I think we can share the send queues. > > > > > > > > > > > > > > > > > > > > > > > But the xsk requires that the send queue use the premapped mode. > > > > > > > So the send queue must support premapped mode. > > > > > > > > > > > > > > command: pktgen_sample01_simple.sh -i eth0 -s 16/1400 -d 10.0.0.123 -m 00:16:3e:12:e1:3e -n 0 -p 100 > > > > > > > machine: ecs.ebmg6e.26xlarge of Aliyun > > > > > > > cpu: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz > > > > > > > iommu mode: intel_iommu=on iommu.strict=1 iommu=nopt > > > > > > > > > > > > > > | iommu off | iommu on > > > > > > > ----------------------|----------------------------------------------------- > > > > > > > | 16 | 1400 | 16 | 1400 > > > > > > > ----------------------|----------------------------------------------------- > > > > > > > Before: |1716796.00 | 1581829.00 | 390756.00 | 374493.00 > > > > > > > After(premapped off): |1733794.00 | 1576259.00 | 390189.00 | 378128.00 > > > > > > > After(premapped on): |1707107.00 | 1562917.00 | 385667.00 | 373584.00 > > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo <xuanzhuo@xxxxxxxxxxxxxxxxx> > > > > > > > --- > > > > > > > drivers/net/virtio/main.c | 119 ++++++++++++++++++++++++++++---- > > > > > > > drivers/net/virtio/virtio_net.h | 10 ++- > > > > > > > 2 files changed, 116 insertions(+), 13 deletions(-) > > > > > > > > > > > > > > diff --git a/drivers/net/virtio/main.c b/drivers/net/virtio/main.c > > > > > > > index 4fbf612da235..53143f95a3a0 100644 > > > > > > > --- a/drivers/net/virtio/main.c > > > > > > > +++ b/drivers/net/virtio/main.c > > > > > > > @@ -168,13 +168,39 @@ static struct xdp_frame *ptr_to_xdp(void *ptr) > > > > > > > return (struct xdp_frame *)((unsigned long)ptr & ~VIRTIO_XDP_FLAG); > > > > > > > } > > > > > > > > > > > > > > +static void virtnet_sq_unmap_buf(struct virtnet_sq *sq, struct virtio_dma_head *dma) > > > > > > > +{ > > > > > > > + int i; > > > > > > > + > > > > > > > + if (!dma) > > > > > > > + return; > > > > > > > + > > > > > > > + for (i = 0; i < dma->next; ++i) > > > > > > > + virtqueue_dma_unmap_single_attrs(sq->vq, > > > > > > > + dma->items[i].addr, > > > > > > > + dma->items[i].length, > > > > > > > + DMA_TO_DEVICE, 0); > > > > > > > + dma->next = 0; > > > > > > > +} > > > > > > > + > > > > > > > static void __free_old_xmit(struct virtnet_sq *sq, bool in_napi, > > > > > > > u64 *bytes, u64 *packets) > > > > > > > { > > > > > > > + struct virtio_dma_head *dma; > > > > > > > unsigned int len; > > > > > > > void *ptr; > > > > > > > > > > > > > > - while ((ptr = virtqueue_get_buf(sq->vq, &len)) != NULL) { > > > > > > > + if (virtqueue_get_dma_premapped(sq->vq)) { > > > > > > > > > > > > Any chance this.can be false? > > > > > > > > > > __free_old_xmit is the common path. > > > > > > > > Did you mean the XDP path doesn't work with this? If yes, we need to > > > > change that. > > > > > > > > > NO. If the virtio core use_dma_api is false, the dma premapped > > > can not be ture. > > > > Ok, I see. > > > > > > > > > > > > > > > > > > > The virtqueue_get_dma_premapped() is used to check whether the sq is premapped > > > > > mode. > > > > > > > > > > > > > > > > > > + dma = &sq->dma.head; > > > > > > > + dma->num = ARRAY_SIZE(sq->dma.items); > > > > > > > + dma->next = 0; > > > > > > > > > > > > Btw, I found in the case of RX we have: > > > > > > > > > > > > virtnet_rq_alloc(): > > > > > > > > > > > > alloc_frag->offset = sizeof(*dma); > > > > > > > > > > > > This seems to defeat frag coalescing when the memory is highly > > > > > > fragmented or high order allocation is disallowed. > > > > > > > > > > > > Any idea to solve this? > > > > > > > > > > > > > > > On the rq premapped pathset, I answered this. > > > > > > > > > > http://lore.kernel.org/all/1692156147.7470396-3-xuanzhuo@xxxxxxxxxxxxxxxxx > > > > > > > > Oops, I forget that. > > > > > > > > > > > > > > > > > > > > > > + } else { > > > > > > > + dma = NULL; > > > > > > > + } > > > > > > > + > > > > > > > + while ((ptr = virtqueue_get_buf_ctx_dma(sq->vq, &len, dma, NULL)) != NULL) { > > > > > > > + virtnet_sq_unmap_buf(sq, dma); > > > > > > > + > > > > > > > if (!is_xdp_frame(ptr)) { > > > > > > > struct sk_buff *skb = ptr; > > > > > > > > > > > > > > @@ -572,16 +598,70 @@ static void *virtnet_rq_alloc(struct virtnet_rq *rq, u32 size, gfp_t gfp) > > > > > > > return buf; > > > > > > > } > > > > > > > > > > > > > > -static void virtnet_rq_set_premapped(struct virtnet_info *vi) > > > > > > > +static void virtnet_set_premapped(struct virtnet_info *vi) > > > > > > > { > > > > > > > int i; > > > > > > > > > > > > > > - /* disable for big mode */ > > > > > > > - if (!vi->mergeable_rx_bufs && vi->big_packets) > > > > > > > - return; > > > > > > > + for (i = 0; i < vi->max_queue_pairs; i++) { > > > > > > > + virtqueue_set_dma_premapped(vi->sq[i].vq); > > > > > > > > > > > > > > - for (i = 0; i < vi->max_queue_pairs; i++) > > > > > > > - virtqueue_set_dma_premapped(vi->rq[i].vq); > > > > > > > + /* TODO for big mode */ > > > > > > > > > > > > Btw, how hard to support big mode? If we can do premapping for that > > > > > > code could be simplified. > > > > > > > > > > > > (There are vendors that doesn't support mergeable rx buffers). > > > > > > > > > > I will do that after these patch-sets > > > > > > > > If it's not too hard, I'd suggest to do it now. > > > > > > > > > YES. Is not too hard, but I was doing too much. > > > > > > * virtio-net + device stats > > > * virtio-net + af-xdp, this patch set has about 27 commits > > > > > > And I was pushing this too long, I just want to finish the work. > > > Then I can work on the next (premapped big mode, af-xdp multi-buf....). > > > > > > So, let we step by step. > > > > That's fine. > > > > Thanks > > >