Re: Redirect to AF_XDP socket not working with bond interface in native mode

On Thu, 21 Dec 2023 at 13:39, Prashant Batra <prbatra.mail@xxxxxxxxx> wrote:
>
> On Wed, Dec 20, 2023 at 1:54 PM Magnus Karlsson
> <magnus.karlsson@xxxxxxxxx> wrote:
> >
> > On Tue, 19 Dec 2023 at 21:18, Prashant Batra <prbatra.mail@xxxxxxxxx> wrote:
> > >
> > > Thanks for your response. My comments inline.
> > >
> > > On Tue, Dec 19, 2023 at 7:17 PM Magnus Karlsson
> > > <magnus.karlsson@xxxxxxxxx> wrote:
> > > >
> > > > On Tue, 19 Dec 2023 at 11:46, Prashant Batra <prbatra.mail@xxxxxxxxx> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > I am new to XDP and exploring how it works with the different interface
> > > > > types supported in Linux. One of my use cases is to be able to receive
> > > > > packets from a bond interface.
> > > > > I used the xdpsock sample program, specifying the bond interface as the
> > > > > input interface. However, the packets received on the bond interface
> > > > > are not handed over to the socket by the kernel if the socket is bound
> > > > > in native mode. Nor are the packets passed up to the kernel network
> > > > > stack. Note that the socket creation does succeed.
> > > > > In skb mode this works and I am able to receive packets in
> > > > > userspace, but in skb mode, as expected, the performance is not that
> > > > > great.
> > > > >
> > > > > Are AF_XDP sockets on a bond interface not supported in native mode? Or,
> > > > > since the packet has to be handed over to the bond driver after
> > > > > reception on the physical port, are an skb allocation and a copy into
> > > > > it indeed a must?
> > > >
> > > > I have never tried a bonding interface with AF_XDP, so it might not
> > > > work. Can you trace the packet to see where it is being dropped in
> > > > native mode? There are no modifications needed to an XDP_REDIRECT
> > > > enabled driver to support AF_XDP in XDP_DRV / copy mode. What NICs are
> > > > you using?
> > > >
> > > I will trace the packet and get back.
> > > The bond is over two physical ports of an Intel NIC. Those are:
> > > b3:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
> > > SFI/SFP+ Network Connection (rev 01)
> > > b3:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
> > > SFI/SFP+ Network Connection (rev 01)
> > >
> > > The bonding mode is 802.3ad (LACP).
> > >
> > > CPU is Intel Xeon Gold 3.40GHz
> > >
> > > NIC Driver
> > > # ethtool -i ens1f0
> > > driver: ixgbe
> > > version: 5.14.0-362.13.1.el9_3
> >
> > Could you please try with the latest kernel 6.7? 5.14 is quite old and
> > a lot of things have happened since then.
> >
> I tried with kernel 6.6.8-1.el9.elrepo.x86_64. I still see the same issue.

OK, good to know. Have you managed to trace where the packet is lost?
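If not, one way to narrow it down, assuming your kernel was built with
the XDP tracepoints, is to record them while traffic is flowing, e.g.
"perf record -e 'xdp:*' -a -- sleep 5" followed by "perf script", and
then look for xdp_redirect_err or xdp_exception events.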

> > > Features
> > > # xdp-loader features ens1f0
> > > NETDEV_XDP_ACT_BASIC:           yes
> > > NETDEV_XDP_ACT_REDIRECT:        yes
> > > NETDEV_XDP_ACT_NDO_XMIT:        no
> > > NETDEV_XDP_ACT_XSK_ZEROCOPY:    yes
> > > NETDEV_XDP_ACT_HW_OFFLOAD:      no
> > > NETDEV_XDP_ACT_RX_SG:           no
> > > NETDEV_XDP_ACT_NDO_XMIT_SG:     no
> > >
> > > Interestingly, bond0 does advertise both native and ZC
> > > mode. That is because the features are copied from the slave devices,
> > > which explains why there is no error when binding the socket in
> > > native/zero-copy mode.
> >
> > It is probably the intention that if both of the bonded devices support a
> > feature, then the bonding device will too. I just noticed that the bonding
> > device does not implement xsk_wakeup, which is used by zero-copy, so
> > zero-copy is not really supported and that support should not be
> > advertised. The AF_XDP code tests for zero-copy support this way:
> >
> > if ((netdev->xdp_features & NETDEV_XDP_ACT_ZC) != NETDEV_XDP_ACT_ZC) {
> >     err = -EOPNOTSUPP;
> >     goto err_unreg_pool;
> > }
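
For reference, NETDEV_XDP_ACT_ZC in the check above is a composite
mask; in include/net/xdp.h it is defined along these lines:

#define NETDEV_XDP_ACT_ZC	(NETDEV_XDP_ACT_BASIC |		\
				 NETDEV_XDP_ACT_REDIRECT |	\
				 NETDEV_XDP_ACT_XSK_ZEROCOPY)

so all three bits have to be advertised by the device before a
zero-copy bind is allowed.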
> >
> > So there are some things needed in the bonding driver to make
> > zero-copy work. Might not be much though. But your problem is with
> > XDP_DRV and copy mode, so let us start there.
> >
> > > void bond_xdp_set_features(struct net_device *bond_dev)
> > > {
> > > ..
> > >     bond_for_each_slave(bond, slave, iter)
> > >         val &= slave->dev->xdp_features;
> > >     xdp_set_features_flag(bond_dev, val);
> > > }
> > >
> > > # ../xdp-loader/xdp-loader features bond0
> > > NETDEV_XDP_ACT_BASIC:           yes
> > > NETDEV_XDP_ACT_REDIRECT:        yes
> > > NETDEV_XDP_ACT_NDO_XMIT:        no
> > > NETDEV_XDP_ACT_XSK_ZEROCOPY:    yes
> > > NETDEV_XDP_ACT_HW_OFFLOAD:      no
> > > NETDEV_XDP_ACT_RX_SG:           no
> > > NETDEV_XDP_ACT_NDO_XMIT_SG:     no
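
As a minimal, untested sketch of the kind of change needed: until the
bonding driver implements ndo_xsk_wakeup, bond_xdp_set_features()
could simply clear the zero-copy bit after folding in the slave
features:

	bond_for_each_slave(bond, slave, iter)
		val &= slave->dev->xdp_features;

	/* The bond driver has no ndo_xsk_wakeup, so zero-copy cannot
	 * actually be used; do not advertise it to user space.
	 */
	val &= ~NETDEV_XDP_ACT_XSK_ZEROCOPY;

	xdp_set_features_flag(bond_dev, val);

A zero-copy bind on bond0 would then fail cleanly with -EOPNOTSUPP in
the check quoted earlier, instead of a mode being advertised that
cannot work.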
> > >
> > > > > Another thing I notice is that other XDP programs attached to the bond
> > > > > interface, with targets like DROP or REDIRECT to another interface, work
> > > > > and perform better than the AF_XDP (skb mode) one. Does this mean that
> > > > > these are not allocating an skb?
> > > >
> > > > I am not surprised that AF_XDP in copy mode is slower than XDP_REDIRECT.
> > > > The packet has to be copied out to user space and then copied into the
> > > > kernel again, something that is not needed in the XDP_REDIRECT case.
> > > > If you were using zero-copy, on the other hand, it would be faster
> > > > with AF_XDP. But the bonding interface does not support zero-copy, so
> > > > that is not an option.
> > > >
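
To make the mode selection concrete: copy versus zero-copy is chosen
via the bind flags when the AF_XDP socket is created. Below is a
minimal, untested sketch using libxdp's xsk API (xdpsock uses the same
calls internally); the interface name and queue id are just example
values:

#include <stdlib.h>
#include <unistd.h>
#include <linux/if_link.h>	/* XDP_FLAGS_DRV_MODE, XDP_FLAGS_SKB_MODE */
#include <linux/if_xdp.h>	/* XDP_COPY, XDP_ZEROCOPY */
#include <xdp/xsk.h>		/* libxdp; bpf/xsk.h in older libbpf */

#define NUM_FRAMES 4096

int main(void)
{
	size_t len = NUM_FRAMES * XSK_UMEM__DEFAULT_FRAME_SIZE;
	struct xsk_ring_prod fq, tx;
	struct xsk_ring_cons cq, rx;
	struct xsk_umem *umem;
	struct xsk_socket *xsk;
	void *bufs;

	/* Packet buffer area shared between kernel and user space */
	if (posix_memalign(&bufs, getpagesize(), len))
		return 1;
	if (xsk_umem__create(&umem, bufs, len, &fq, &cq, NULL))
		return 1;

	struct xsk_socket_config cfg = {
		.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
		.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
		.xdp_flags = XDP_FLAGS_DRV_MODE,	/* native mode */
		.bind_flags = XDP_COPY,	/* or XDP_ZEROCOPY where supported */
	};

	/* Bind to queue 0 of bond0; returns a negative errno on failure */
	return xsk_socket__create(&xsk, "bond0", 0, umem, &rx, &tx, &cfg);
}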
> > >
> > > Just to share the pps numbers for the above-mentioned single port
> > > in the different modes, and a comparison with the bond interface.
> > > The test uses pktgen pumping 64-byte packets on a single flow.
> > >
> > > Single AF_XDP sock on a single NIC queue:
> > >
> > >   AF_XDP rxdrop   PPS    CPU-SI*   CPU-xdpsock   Command
> > >  ══════════════════════════════════════════════════════════════════════════
> > >   ZC              14M    65%       35%           ./xdpsock -r -i ens1f0 -q 5 -p -n 1 -N -z
> > >   XDP_DRV/COPY    10M    100%      23%           ./xdpsock -r -i ens1f0 -q 5 -p -n 1 -N -c
> > >   SKB_MODE        2.2M   100%      62%           ./xdpsock -r -i ens1f0 -q 5 -p -n 1 -S
> > >
> > >   * CPU receiving the packet
> > > In the above tests, when using ZC and XDP_DRV/COPY, is this SI usage as
> > > expected, especially in ZC mode? Is it mainly because the BPF
> > > program runs in non-HW-offloaded mode? I don't have a NIC that can
> > > run BPF in offloaded mode, so I cannot compare.
> >
> > I get about 25 - 30 Mpps at 100% CPU load on my system, but I have a
> > 100G card and you are maxing out your 10G card at 65% and 14M, so yes,
> > that sounds reasonable. HW offload cannot be used with AF_XDP; you need
> > to do the redirect on the CPU for it to work. If you want to know where
> > the time is spent, use "perf top". The biggest chunk of time is spent in
> > the XDP_REDIRECT operation, but there are many other time thieves too.
> >
> > > The XDP_DROP target using the xdp-bench tool (from xdp-tools) on the same NIC port:
> > >
> > >   xdp-bench          PPS   CPU-SI*   Command
> > >  ══════════════════════════════════════════════════════════
> > >   drop, no-touch     14M   41%       ./xdp-bench drop -p no-touch ens1f0 -e
> > >   drop, read-data    14M   55%       ./xdp-bench drop -p read-data ens1f0 -e
> > >   drop, parse-ip     14M   58%       ./xdp-bench drop -p parse-ip ens1f0 -e
> > >
> > >   * CPU receiving the packet
> > >
> > > The same tests on the bond interface (the two ports mentioned above, bonded):
> > >
> > >   AF_XDP rxdrop   PPS   CPU-SI*   CPU-xdpsock   Command
> > >  ══════════════════════════════════════════════════════════════════════════
> > >   ZC              X     X         X             ./xdpsock -r -i bond0 -q 0 -p -n 1 -N -z
> > >   XDP_DRV/COPY    X     X         X             ./xdpsock -r -i bond0 -q 0 -p -n 1 -N -c
> > >   SKB_MODE        2M    100%      55%           ./xdpsock -r -i bond0 -q 0 -p -n 1 -S
> > >
> > >   * CPU receiving the packet
> > >
> > >   xdp-bench          PPS     CPU-SI*   Command
> > >  ══════════════════════════════════════════════════════════
> > >   drop, no-touch     10.9M   33%       ./xdp-bench drop -p no-touch bond0 -e
> > >   drop, read-data    10.9M   44%       ./xdp-bench drop -p read-data bond0 -e
> > >   drop, parse-ip     10.9M   47%       ./xdp-bench drop -p parse-ip bond0 -e
> > >
> > >   * CPU receiving the packet
> > >
> > >
> > > > > Kindly share your thoughts and advice.
> > > > >
> > > > > Thanks,
> > > > > Prashant
> > > > >




