On Mon, Jan 13, 2020 at 1:28 AM Ryan Goodfellow <rgoodfel@xxxxxxx> wrote:
>
> Greetings XDP folks. I've been working on a zero-copy XDP bridge
> implementation similar to what's described in the following thread.
>
> https://www.spinics.net/lists/xdp-newbies/msg01333.html
>
> I now have an implementation that is working reasonably well under certain
> conditions for various hardware. The implementation is primarily based on the
> xdpsock_user program in the kernel under samples/bpf. You can find my program
> and corresponding BPF program here.
>
> - https://gitlab.com/mergetb/tech/network-emulation/kernel/blob/v5.5-moa/samples/bpf/xdpsock_multidev.c
> - https://gitlab.com/mergetb/tech/network-emulation/kernel/blob/v5.5-moa/samples/bpf/xdpsock_multidev_kern.c
>
> I have a small testbed to run this code on that looks like the following.
>
> Packet forwarding machine:
>     CPU: Intel(R) Xeon(R) D-2146NT CPU @ 2.30GHz (8 core / 16 thread)
>     Memory: 32 GB
>     NICs:
>     - Mellanox ConnectX 4 Dual 100G MCX416A-CCAT (connected at 40G)
>     - Intel X722 10G SFP+
>
> Sender/receiver machines
>     CPU: Intel(R) Xeon(R) D-2146NT CPU @ 2.30GHz (8 core / 16 thread)
>     Memory: 32 GB
>     NICs:
>     - Mellanox ConnectX 4 40G MCX4131A-BCAT
>     - Intel X722 10G SFP+
>
> I could not get zero-copy to work with the i40e driver as it would crash. I've
> attached the corresponding traces from dmesg. The results below are with the
> i40e running in SKB/copy mode. I do have an X710-DA4 that I could plug into the
> server and test with instead of the X722 if that is of interest. In all cases I
> used a single hardware queue via the following.
>
>     ethtool -L <dev> combined 1
>
> The Mellanox cards in zero-copy mode create a sort of shadow set of queues, I
> used ntuple rules to push things through queue 1 (shadows 0) as follows
>
>     ethtool -N <dev> flow-type ether src <mac> action 1
>
> The numbers that I have been able to achieve with this code are the following. MTU
> is 1500 in all cases.
>
> mlx5: pps ~ 2.4 Mpps, 29 Gbps (driver mode, zero-copy)
> i40e: pps ~ 700 Kpps, 8 Gbps (skb mode, copy)
> virtio: pps ~ 200 Kpps, 2.4 Gbps (skb mode, copy, all qemu/kvm VMs)
>
> Are these numbers in the ballpark of what's expected?
>
> One thing I have noticed is that I cannot create large memory maps for the
> packet buffers. For example a frame size of 2048 with 524288 frames (around
> 1G of packets) is fine. However, increasing size by an order of magnitude, which
> is well within the memory capacity of the host machine results in an error when
> creating the UMEM and the kernel shows the attached call trace. I'm going to
> begin investigating this in more detail soon, but if anyone has advice on large
> XDP memory maps that would be much appreciated.

Hi Ryan,

Thanks for taking XDP and AF_XDP for a spin. I will start by fixing
this out-of-memory issue. With your umem size, we are hitting the size
limit of kmalloc. I will fix this by using kvmalloc, which falls back to
vmalloc if kmalloc fails. That should hopefully make it possible for
you to allocate larger umems.

> The reason for wanting large memory maps is that our use case for XDP is network
> emulation - and sometimes that means introducing delay factors that can require
> rather large in-memory packet buffers.
>
> If there is interest in including this program in the official BPF samples I'm happy to
> submit a patch. Any comments on the program are also much appreciated.

More examples are always useful, but the question is whether it should
reside in samples or outside the kernel in some other repo. Is there
some good place in the xdp-project GitHub that could be used for this
purpose?

/Magnus

> Thanks
>
> ~ ry
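
For anyone following along, here is a rough back-of-the-envelope sketch of why a umem around 1G can register while one ten times larger can hit the kmalloc size limit mentioned above. It is only an illustration, and the constants are assumptions that are not stated in this thread: 4 KiB pages, 16 bytes of kernel bookkeeping per umem page, and a 4 MiB ceiling on a single kmalloc allocation. It also reads "an order of magnitude" as exactly 10x the frame count. The function names are made up for the example.

```python
# Rough sizing of the per-page bookkeeping needed for a UMEM registration.
# ASSUMPTIONS (not from the thread): 4 KiB pages, 16 bytes of bookkeeping
# per page, and a 4 MiB upper bound on one kmalloc allocation.

PAGE_SIZE = 4096                  # assumed page size in bytes
ENTRY_SIZE = 16                   # assumed bookkeeping bytes per umem page
KMALLOC_LIMIT = 4 * 1024 * 1024   # assumed largest single kmalloc, in bytes


def umem_bytes(frame_size, num_frames):
    """Total size of the packet buffer area in bytes."""
    return frame_size * num_frames


def bookkeeping_bytes(frame_size, num_frames):
    """Size of a per-page array covering the whole umem."""
    total = umem_bytes(frame_size, num_frames)
    pages = -(-total // PAGE_SIZE)  # ceiling division
    return pages * ENTRY_SIZE


def fits_in_kmalloc(frame_size, num_frames):
    """True if the per-page array stays within the assumed kmalloc limit."""
    return bookkeeping_bytes(frame_size, num_frames) <= KMALLOC_LIMIT


if __name__ == "__main__":
    # 524288 frames is the working case from the report; 10x is the
    # assumed reading of "an order of magnitude" larger.
    for frames in (524288, 10 * 524288):
        size_gib = umem_bytes(2048, frames) / 2**30
        meta_mib = bookkeeping_bytes(2048, frames) / 2**20
        print(f"{frames} frames: umem {size_gib:.0f} GiB, "
              f"per-page array {meta_mib:.0f} MiB, "
              f"fits in kmalloc: {fits_in_kmalloc(2048, frames)}")
```

Under these assumptions the 1G case lands exactly on the limit and the 10x case is well past it. That is consistent with the behaviour described, but the real per-page structure size and allocation limit depend on the kernel version and configuration.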