Re: [PATCH bpf-next v2 0/4] XDP bonding support

On Thu, Jul 1, 2021 at 9:20 PM Jay Vosburgh <jay.vosburgh@xxxxxxxxxxxxx> wrote:
>
> joamaki@xxxxxxxxx wrote:
>
> >From: Jussi Maki <joamaki@xxxxxxxxx>
> >
> >This patchset introduces XDP support to the bonding driver.
> >
> >The motivation for this change is to enable use of bonding (and
> >802.3ad) in hairpinning L4 load-balancers such as [1] implemented with
> >XDP and also to transparently support bond devices for projects that
> >use XDP given most modern NICs have dual port adapters.  An alternative
> >to this approach would be to implement 802.3ad in user-space and
> >implement the bonding load-balancing in the XDP program itself, but
> >that is rather a cumbersome endeavor in terms of slave device management
> >(e.g. by watching netlink) and requires separate programs for native
> >vs bond cases for the orchestrator. A native in-kernel implementation
> >overcomes these issues and provides more flexibility.
> >
> >Below are benchmark results done on two machines with 100Gbit
> >Intel E810 (ice) NIC and with 32-core 3970X on sending machine, and
> >16-core 3950X on receiving machine. 64 byte packets were sent with
> >pktgen-dpdk at full rate. Two issues [2, 3] were identified with the
> >ice driver, so the tests were performed with iommu=off and patch [2]
> >applied. Additionally the bonding round robin algorithm was modified
> >to use per-cpu tx counters as high CPU load (50% vs 10%) and high rate
> >of cache misses were caused by the shared rr_tx_counter. Fix for this
> >has been already merged into net-next. The statistics were collected
> >using "sar -n dev -u 1 10".
> >
> > -----------------------|  CPU  |--| rxpck/s |--| txpck/s |----
> > without patch (1 dev):
> >   XDP_DROP:              3.15%      48.6Mpps
> >   XDP_TX:                3.12%      18.3Mpps     18.3Mpps
> >   XDP_DROP (RSS):        9.47%      116.5Mpps
> >   XDP_TX (RSS):          9.67%      25.3Mpps     24.2Mpps
> > -----------------------
> > with patch, bond (1 dev):
> >   XDP_DROP:              3.14%      46.7Mpps
> >   XDP_TX:                3.15%      13.9Mpps     13.9Mpps
> >   XDP_DROP (RSS):        10.33%     117.2Mpps
> >   XDP_TX (RSS):          10.64%     25.1Mpps     24.0Mpps
> > -----------------------
> > with patch, bond (2 devs):
> >   XDP_DROP:              6.27%      92.7Mpps
> >   XDP_TX:                6.26%      17.6Mpps     17.5Mpps
> >   XDP_DROP (RSS):       11.38%      117.2Mpps
> >   XDP_TX (RSS):         14.30%      28.7Mpps     27.4Mpps
> > --------------------------------------------------------------
>
>         To be clear, the fact that the performance numbers for XDP_DROP
> and XDP_TX are lower for "with patch, bond (1 dev)" than "without patch
> (1 dev)" is expected, correct?

Yes, that is correct. With the patch applied, the ndo callback for
choosing the slave device is invoked, which in this test (mode=xor)
hashes the L2 and L3 headers (I seem to have failed to mention this in
the original message). In round-robin mode I recall it being about
16Mpps versus the 18Mpps without the patch. I also tried
"INDIRECT_CALL" to avoid going via the ndo ops, but that had no
discernible effect.


