On Wed, Jun 9, 2021 at 6:55 AM Jussi Maki <joamaki@xxxxxxxxx> wrote:
>
> This patchset introduces XDP support to the bonding driver.
>
> Patch 1 contains the implementation, including support for the
> recently introduced EXCLUDE_INGRESS. Patch 2 contains a performance
> fix to the round-robin mode, switching rr_tx_counter to be per-CPU.
> Patch 3 contains the test suite for the implementation, using a pair
> of veth devices.
>
> vmtest.sh is modified to enable the bonding module and install
> modules. The config change should probably be done in the libbpf
> repository. Andrii: how would you like this done properly?

I don't think vmtest.sh and the CI setup support modules (not easily,
at least). Can we just compile that driver in? Then you can submit a
PR against the libbpf GitHub repo to adjust the config. We also have a
kernel CI repo where we'll need to make this change. (A sketch of the
config change I have in mind follows below the quoted benchmark
numbers.)

> The motivation for this change is to enable use of bonding (and
> 802.3ad) in hairpinning L4 load-balancers such as [1] implemented with
> XDP, and also to transparently support bond devices for projects that
> use XDP, given that most modern NICs have dual-port adapters. An
> alternative to this approach would be to implement 802.3ad in user
> space and do the bonding load balancing in the XDP program itself, but
> this is rather a cumbersome endeavor in terms of slave device
> management (e.g. by watching netlink) and requires separate programs
> for the native vs. bond cases for the orchestrator. A native in-kernel
> implementation overcomes these issues and provides more flexibility.
>
> Below are benchmark results from two machines with 100Gbit
> Intel E810 (ice) NICs, with a 32-core 3970X on the sending machine
> and a 16-core 3950X on the receiving machine. 64-byte packets were
> sent with pktgen-dpdk at full rate. Two issues [2, 3] were identified
> with the ice driver, so the tests were performed with iommu=off and
> patch [2] applied. Additionally, the bonding round-robin algorithm
> was modified to use per-cpu tx counters, as the shared rr_tx_counter
> caused high CPU load (50% vs. 10%) and a high rate of cache misses
> (see patch 2/3). The statistics were collected using
> "sar -n dev -u 1 10".
>
> -----------------------|  CPU  |--| rxpck/s |--| txpck/s |----
> without patch (1 dev):
>   XDP_DROP:              3.15%      48.6Mpps
>   XDP_TX:                3.12%      18.3Mpps     18.3Mpps
>   XDP_DROP (RSS):        9.47%     116.5Mpps
>   XDP_TX (RSS):          9.67%      25.3Mpps     24.2Mpps
> -----------------------
> with patch, bond (1 dev):
>   XDP_DROP:              3.14%      46.7Mpps
>   XDP_TX:                3.15%      13.9Mpps     13.9Mpps
>   XDP_DROP (RSS):       10.33%     117.2Mpps
>   XDP_TX (RSS):         10.64%      25.1Mpps     24.0Mpps
> -----------------------
> with patch, bond (2 devs):
>   XDP_DROP:              6.27%      92.7Mpps
>   XDP_TX:                6.26%      17.6Mpps     17.5Mpps
>   XDP_DROP (RSS):       11.38%     117.2Mpps
>   XDP_TX (RSS):         14.30%      28.7Mpps     27.4Mpps
> --------------------------------------------------------------
>
> RSS: Receive Side Scaling, i.e. the packets were sent to a range of
> destination IPs.
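
Regarding the config change: what I have in mind is roughly the
following fragment for the CI kernel config (a sketch only; the exact
config file to touch in the libbpf and kernel CI repos may differ, and
veth may already be enabled there):

  # Build bonding into the kernel instead of as a module, so vmtest.sh
  # does not need to install modules into the VM image.
  CONFIG_BONDING=y
  # The new selftest runs the bond over a pair of veth devices.
  CONFIG_VETH=y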
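
On the per-cpu rr_tx_counter change mentioned above: the pattern is to
give each CPU its own counter, so the transmit hot path increments a
CPU-local value instead of all cores contending on one shared cache
line. A minimal sketch of that pattern using the kernel per-cpu API
(illustrative names, not the actual patch):

  #include <linux/errno.h>
  #include <linux/percpu.h>
  #include <linux/types.h>

  /* Hypothetical container standing in for struct bonding. */
  struct rr_state {
          u32 __percpu *rr_tx_counter;
  };

  static int rr_state_init(struct rr_state *s)
  {
          /* One u32 counter per possible CPU. */
          s->rr_tx_counter = alloc_percpu(u32);
          return s->rr_tx_counter ? 0 : -ENOMEM;
  }

  /* Hot path: round-robin slave selection only needs a value that
   * keeps advancing, not a globally ordered sequence, so a CPU-local
   * increment is sufficient and avoids cache-line bouncing.
   */
  static u32 rr_next_tx_id(struct rr_state *s)
  {
          return this_cpu_inc_return(*s->rr_tx_counter);
  }

  static void rr_state_free(struct rr_state *s)
  {
          free_percpu(s->rr_tx_counter);
  }

The slight relaxation of strict per-packet round-robin ordering is
what buys the lower CPU load and fewer cache misses seen in the
numbers above.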
>
> [1]: https://cilium.io/blog/2021/05/20/cilium-110#standalonelb
> [2]: https://lore.kernel.org/bpf/20210601113236.42651-1-maciej.fijalkowski@xxxxxxxxx/T/#t
> [3]: https://lore.kernel.org/bpf/CAHn8xckNXci+X_Eb2WMv4uVYjO2331UWB2JLtXr_58z0Av8+8A@xxxxxxxxxxxxxx/
>
> ---
>
> Jussi Maki (3):
>   net: bonding: Add XDP support to the bonding driver
>   net: bonding: Use per-cpu rr_tx_counter
>   selftests/bpf: Add tests for XDP bonding
>
>  drivers/net/bonding/bond_main.c               | 459 +++++++++++++++---
>  include/linux/filter.h                        |  13 +-
>  include/linux/netdevice.h                     |   5 +
>  include/net/bonding.h                         |   3 +-
>  kernel/bpf/devmap.c                           |  34 +-
>  net/core/filter.c                             |  37 +-
>  .../selftests/bpf/prog_tests/xdp_bonding.c    | 342 +++++++++++++
>  tools/testing/selftests/bpf/vmtest.sh         |  30 +-
>  8 files changed, 843 insertions(+), 80 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_bonding.c
>
> --
> 2.30.2
>