Re: Redirect to AF_XDP socket not working with bond interface in native mode

On Tue, 19 Dec 2023 at 21:18, Prashant Batra <prbatra.mail@xxxxxxxxx> wrote:
>
> Thanks for your response. My comments inline.
>
> On Tue, Dec 19, 2023 at 7:17 PM Magnus Karlsson
> <magnus.karlsson@xxxxxxxxx> wrote:
> >
> > On Tue, 19 Dec 2023 at 11:46, Prashant Batra <prbatra.mail@xxxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > I am new to XDP and am exploring how it works with the different
> > > interface types supported in Linux. One of my use cases is to
> > > receive packets from a bond interface.
> > > I used the xdpsock sample program, specifying the bond interface
> > > as the input interface. However, the packets received on the bond
> > > interface are not handed over to the socket by the kernel when the
> > > socket is bound in native mode. The packets are not passed up to
> > > the kernel stack either.
> > > Note that the socket creation does succeed.
> > > In skb mode this works and I am able to receive packets in
> > > userspace, but, as expected, the performance in skb mode is not
> > > great.
> > >
> > > Are AF_XDP sockets on a bond not supported in native mode? Or,
> > > since the packet has to be handed over to the bond driver after
> > > reception on the physical port, is an skb allocation and copy
> > > indeed a must?
> >
> > I have never tried a bonding interface with AF_XDP, so it might not
> > work. Can you trace the packet to see where it is being dropped in
> > native mode? There are no modifications needed to an XDP_REDIRECT
> > enabled driver to support AF_XDP in XDP_DRV / copy mode. What NICs are
> > you using?
> >
> I will trace the packet and get back.
> The bond is over 2 physical ports that are part of the Intel NIC. Those are-
> b3:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
> SFI/SFP+ Network Connection (rev 01)
> b3:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
> SFI/SFP+ Network Connection (rev 01)
>
> Bonding algo is 802.3ad
>
> CPU is Intel Xeon Gold 3.40GHz
>
> NIC Driver
> # ethtool -i ens1f0
> driver: ixgbe
> version: 5.14.0-362.13.1.el9_3

Could you please try with the latest kernel 6.7? 5.14 is quite old and
a lot of things have happened since then.
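
For the tracing part, the xdp:* tracepoints (xdp:xdp_redirect_err,
xdp:xdp_exception and friends) usually show where a redirect fails in
native mode. Since you already have xdp-tools, something like this
should work (exact options may differ):

# ./xdp-monitor
# perf record -e 'xdp:*' -a -- sleep 5 && perf script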

> Features
> # xdp-loader features ens1f0
> NETDEV_XDP_ACT_BASIC:           yes
> NETDEV_XDP_ACT_REDIRECT:        yes
> NETDEV_XDP_ACT_NDO_XMIT:        no
> NETDEV_XDP_ACT_XSK_ZEROCOPY:    yes
> NETDEV_XDP_ACT_HW_OFFLOAD:      no
> NETDEV_XDP_ACT_RX_SG:           no
> NETDEV_XDP_ACT_NDO_XMIT_SG:     no
>
> An interesting thing is that bond0 does advertise both native and ZC
> mode. That is because the features are copied from the slave devices,
> which explains why there is no error while binding the socket in
> native/zero-copy mode.

It is probably the intention that if both bonded devices support a
feature, then the bonding device will too. However, I just noticed that
the bonding device does not implement xsk_wakeup, which is used by
zero-copy, so zero-copy is not really supported and that support should
not be advertised. The AF_XDP code tests for zero-copy support this way:

if ((netdev->xdp_features & NETDEV_XDP_ACT_ZC) != NETDEV_XDP_ACT_ZC) {
    err = -EOPNOTSUPP;
    goto err_unreg_pool;
}
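
One missing piece is the wakeup hook. A rough, untested sketch of what
it could look like is below; bond_queue_id_to_slave() is a hypothetical
helper here, and the real work would be deciding how bond queue ids map
onto slave queues, which the bonding driver does not do today:

#include <linux/errno.h>
#include <linux/netdevice.h>

/* Forward the AF_XDP wakeup to the slave that actually owns the queue. */
static int bond_xsk_wakeup(struct net_device *bond_dev, u32 queue_id,
                           u32 flags)
{
    struct net_device *slave_dev;

    slave_dev = bond_queue_id_to_slave(bond_dev, queue_id);
    if (!slave_dev || !slave_dev->netdev_ops->ndo_xsk_wakeup)
        return -EOPNOTSUPP;

    return slave_dev->netdev_ops->ndo_xsk_wakeup(slave_dev, queue_id,
                                                 flags);
}

plus a .ndo_xsk_wakeup = bond_xsk_wakeup entry in bond_netdev_ops.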

So there are some things needed in the bonding driver to make
zero-copy work, but it might not be much. Your problem, though, is with
XDP_DRV and copy mode, so let us start there.

> void bond_xdp_set_features(struct net_device *bond_dev)
> {
> ..
>     bond_for_each_slave(bond, slave, iter)
>         val &= slave->dev->xdp_features;
>     xdp_set_features_flag(bond_dev, val);
> }
>
> # ../xdp-loader/xdp-loader features bond0
> NETDEV_XDP_ACT_BASIC:           yes
> NETDEV_XDP_ACT_REDIRECT:        yes
> NETDEV_XDP_ACT_NDO_XMIT:        no
> NETDEV_XDP_ACT_XSK_ZEROCOPY:    yes
> NETDEV_XDP_ACT_HW_OFFLOAD:      no
> NETDEV_XDP_ACT_RX_SG:           no
> NETDEV_XDP_ACT_NDO_XMIT_SG:     no
>
> > > Another thing I notice is that other XDP programs attached to the
> > > bond interface, with targets like DROP or REDIRECT to another
> > > interface, work and perform better than the AF_XDP (skb) one.
> > > Does this mean that these are not allocating an skb?
> >
> > I am not surprised that AF_XDP in copy mode is slower than
> > XDP_REDIRECT. The packet has to be copied out to user space and
> > then copied into the kernel again, something that is not needed in
> > the XDP_REDIRECT case. If you were using zero-copy, on the other
> > hand, AF_XDP would be faster. But the bonding interface does not
> > support zero-copy, so that is not an option.
> >
>
> Just to put forth the PPS numbers for the above-mentioned single port
> in different modes, and a comparison with the bond interface.
> The test uses pktgen pumping 64-byte packets on a single flow.
>
> Single AF_XDP sock on a single NIC queue-
>   AF_XDP rxdrop   PPS    CPU-SI*   CPU-xdpsock   Command
>  ══════════════════════════════════════════════════════════
>   ZC              14M    65%       35%           ./xdpsock -r -i ens1f0 -q 5 -p -n 1 -N -z
>   XDP_DRV/COPY    10M    100%      23%           ./xdpsock -r -i ens1f0 -q 5 -p -n 1 -N -c
>   SKB_MODE        2.2M   100%      62%           ./xdpsock -r -i ens1f0 -q 5 -p -n 1 -S
> * CPU receiving the packet
> In the above tests, when using ZC and XDP_DRV/COPY, is this SI usage
> as expected, especially in ZC mode? Is it mainly because the BPF
> program is running in non-HW-offloaded mode? I don't have a NIC that
> can run BPF in offloaded mode, so I cannot compare.

I get about 25-30 Mpps at 100% CPU load on my system, but I have a
100G card, while you are maxing out your 10G card at 14 Mpps with 65%
SI. So yes, that sounds reasonable. HW offload cannot be used with
AF_XDP; you need to do the redirect on the CPU for it to work. If you
want to know where the time is spent, use "perf top". The biggest chunk
of time is spent in the XDP_REDIRECT operation, but there are many
other time thieves too.
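
To give an idea of what that operation is: the xdpsock kernel program
is essentially a bpf_redirect_map() into an XSKMAP, keyed by the
receive queue. A minimal sketch of that kind of program (not the exact
code in the sample) looks roughly like this:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_XSKMAP);
    __uint(max_entries, 64); /* sized arbitrarily for the sketch */
    __type(key, __u32);
    __type(value, __u32);
} xsks_map SEC(".maps");

SEC("xdp")
int xsk_redirect_prog(struct xdp_md *ctx)
{
    /* On newer kernels the flags argument doubles as the fallback
     * action when no socket is bound on this queue (XDP_PASS here).
     */
    return bpf_redirect_map(&xsks_map, ctx->rx_queue_index, XDP_PASS);
}

char _license[] SEC("license") = "GPL";

Everything triggered by that return value (xdp_do_redirect(), the xsk
receive path and, in copy mode, the copy into the umem) is what ends up
as SI time on the receiving core.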

> The XDP_DROP target using xdp-bench tool (from xdp-tools) on the same NIC port-
>   xdp-bench          PPS    CPU-SI*   Command
>  ═══════════════════════════════════════════════
>   drop, no-touch     14M    41%       ./xdp-bench drop -p no-touch ens1f0 -e
>   drop, read-data    14M    55%       ./xdp-bench drop -p read-data ens1f0 -e
>   drop, parse-ip     14M    58%       ./xdp-bench drop -p parse-ip ens1f0 -e
> * CPU receiving the packet
>
> Similar tests on the bond interface (the two ports bonded above)-
>   AF_XDP rxdrop   PPS    CPU-SI*   CPU-xdpsock   Command
>  ══════════════════════════════════════════════════════════
>   ZC              X      X         X             ./xdpsock -r -i bond0 -q 0 -p -n 1 -N -z
>   XDP_DRV/COPY    X      X         X             ./xdpsock -r -i bond0 -q 0 -p -n 1 -N -c
>   SKB_MODE        2M     100%      55%           ./xdpsock -r -i bond0 -q 0 -p -n 1 -S
> * CPU receiving the packet
>
>   xdp-bench          PPS     CPU-SI*   Command
>  ═══════════════════════════════════════════════
>   drop, no-touch     10.9M   33%       ./xdp-bench drop -p no-touch bond0 -e
>   drop, read-data    10.9M   44%       ./xdp-bench drop -p read-data bond0 -e
>   drop, parse-ip     10.9M   47%       ./xdp-bench drop -p parse-ip bond0 -e
> * CPU receiving the packet
>
>
> > > Kindly share your thoughts and advice.
> > >
> > > Thanks,
> > > Prashant
> > >




