Re: [xdp-cloud] Questions about Offloads and XDP-Hints regarding a Cloud-Provider Use-Case

On 28/09/2022 15.54, Marcus Wichelmann wrote:

I'm working for a cloud hosting provider and we're working on a new XDP-based networking stack for our VM-Hosts that uses XDP to accelerate the connectivity of our qemu/KVM VMs to the outside.


Welcome to the community! Sounds like an excellent use-case and
opportunity for speeding up the RX packets from physical NIC into the
VM.  Good to hear that someone (again) has this use-case. I've
personally not been focused on it lately, mostly because the community
members that I was interacting with changed jobs, away from cloud
hosting companies. Good to have a user back in this area!


For this, we use XDP_REDIRECT to forward packets between the physical host NIC and the VM tap devices. The main issue we have now is that our VM guests have some virtio NIC offloads enabled: rx/tx checksumming, TSO/GSO, GRO and Scatter-Gather.

Supporting RX-checksumming is part of the plans for XDP-hints, although
virtio_net is not part of my initial patchset.

XDP-redirect of GRO and Scatter-Gather frames is part of the
multi-buff effort (Cc Lorenzo), but currently XDP_REDIRECT with
multi-buff is disabled (except for cpumap) because of the lack of
XDP-feature bits, meaning we cannot determine (in kernel) whether the
receiving net_device supports multi-buff (Cc Kumar).
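
To make the missing piece concrete, here is a minimal userspace sketch
of the kind of check that per-device feature bits would allow. All
names below (the DEMO_* flags, the structs and the function) are
hypothetical and only for illustration; they are not kernel APIs. The
idea is simply that a multi-buff frame may only be redirected to a
device that advertises multi-buff support.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device XDP feature bits (illustration only). */
#define DEMO_XDP_F_REDIRECT_TARGET (1u << 0) /* can receive redirected frames */
#define DEMO_XDP_F_MULTI_BUFF      (1u << 1) /* can handle multi-buffer frames */

/* Hypothetical stand-in for the receiving net_device. */
struct demo_netdev {
    uint32_t xdp_features;
};

/* Hypothetical stand-in for a frame; nr_frags == 0 means single buffer. */
struct demo_frame {
    unsigned int nr_frags;
};

/* Return true only if the destination device can accept this frame. */
static bool demo_redirect_allowed(const struct demo_netdev *dst,
                                  const struct demo_frame *frm)
{
    if (!(dst->xdp_features & DEMO_XDP_F_REDIRECT_TARGET))
        return false;
    /* Multi-buff frames need explicit support on the receiving side. */
    if (frm->nr_frags > 0 && !(dst->xdp_features & DEMO_XDP_F_MULTI_BUFF))
        return false;
    return true;
}
```

Without such bits the kernel has no way to answer this question for the
target device, which is why the multi-buff redirect is refused up front.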

Currently, these offloads (especially TSO/GSO) are incompatible with XDP_REDIRECT and result in packets being dropped. Because disabling these offloads in all our customer VMs is not a good option, we're searching for ways to support these offloads with XDP.


To David Ahern, didn't the kernel recently loosen up on having to
disable these offloads for KVM virtio_net?

My (long term) goal is to improve the situation and allow more offloads
to get enabled for KVM/virtio_net. But my current focus is on veth.


The XDP multi-buffer support needed for TSO/GSO seems to be mostly there

A subtle detail is that both XDP-hints and XDP multi-buff are needed to
get GRO/GSO kernel infra working.  For the kernel to construct GRO-SKB
based packets on XDP-redirected incoming xdp_frame's, the kernel code
requires both RX-csum and RX-hash before coalescing GRO frames.
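
As a rough illustration of that dependency, the precondition can be
written as a simple bitmask test. This is a userspace sketch with
hypothetical names (the DEMO_HINT_* flags and struct are assumptions
for illustration, not the XDP-hints layout from my patchset): a frame
is only a candidate for GRO coalescing when both hints are present.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hint flags carried along with a redirected frame. */
#define DEMO_HINT_RX_CSUM (1u << 0) /* HW validated the checksum */
#define DEMO_HINT_RX_HASH (1u << 1) /* HW provided an RX flow hash */

/* Hypothetical stand-in for an xdp_frame with hints attached. */
struct demo_xdp_frame {
    uint32_t hints;
};

/* GRO coalescing needs both RX-csum and RX-hash; either alone is not enough. */
static bool demo_gro_eligible(const struct demo_xdp_frame *frm)
{
    const uint32_t need = DEMO_HINT_RX_CSUM | DEMO_HINT_RX_HASH;

    return (frm->hints & need) == need;
}
```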

already, but, to our understanding, the last missing part for full TSO/GSO support is a way to tell the physical NIC to perform the TSO/GSO offload.


The TSO/GSO side is usually the TX side.  The VM should be able to send
out normal TSO/GSO (multi-buffer) packets.  Or are you saying this also
gets disabled when enabling XDP on the virtio_net RX side?


I've seen the latest LPC 2022 talk from Jesper Dangaard Brouer regarding the planned XDP-Hints feature. But this was mainly about Checksum and VLAN offloads. Is supporting TSO/GSO also one of the goals you have in mind with these XDP-Hints proposals?


As mentioned, TSO/GSO is on the TX side. We (Cc Magnus) also want to
extend XDP-hints to the TX side, to allow asking the HW to perform
different offloads. Let's land the RX side first.

Will the multi-buffer and hints patches be all that's needed to make XDP_REDIRECT between a VM (without disabled offloads) and the host NIC possible, or are there more things missing in XDP that will become an issue in that use-case?


As hinted above, we also need net_device XDP-features to enable
redirecting XDP multi-buff frames.


The "XDP Cloud-Provider" project page describes a very similar use-case to what we plan to do. What's the goal of this project?


Yes, this sounds VERY similar to your use-case.

I think you are referring to this:
 [1] https://xdp-project.net/areas/xdp-cloud-provider.html
 [2] https://github.com/xdp-project/xdp-cloud

We had two Cloud Hosting companies interested in this use-case and
started a "sub" xdp-project, with the intent of working together on
code[2] that implements concrete BPF tools that function as building
blocks the individual companies can integrate into their systems,
while leaving customer provisioning to each company.
(p.s. this approach has worked well for the xdp-cpumap-tc[3] scaling tool)

Unfortunately the Cloud-Provider project "died", because the engineers
from the Cloud Hosting companies got better job offers, and other
engineers at these companies were apparently not motivated to take over.


[3] https://github.com/xdp-project/xdp-cpumap-tc

We're very interested in the work on XDP-Hints and in the performance benefits that the offloading support could bring to XDP and I would be thankful if you could help us with some of our questions.

Thanks for showing interest in XDP-hints, this is very motivating for me
personally to continue this work upstream.

--Jesper



