Re: [PATCH v2 bpf-next 00/14] xsk: stop NAPI Rx processing on full XSK RQ

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Anyone from Intel? Maciej, Björn, Magnus?

Does anyone else have anything to say? IMO, calling XDP for the same
packet multiple times is a bug, we should agree on some sane solution.

On Thu, 2022-08-18 at 14:26 +0300, Maxim Mikityanskiy wrote:
> Hi Maciej,
> 
> On Wed, 2022-04-13 at 17:30 +0200, Maciej Fijalkowski wrote:
> > v2:
> > - add likely for internal redirect return codes in ice and ixgbe
> >   (Jesper)
> > - do not drop the buffer that head pointed to at full XSK RQ
> > (Maxim)
> 
> I found an issue with this approach. If you don't drop that packet,
> you'll need to run the XDP program for the same packet again. If the
> XDP program is anything more complex than "redirect-everything-to-
> XSK",
> it will get confused, for example, if it tracks any state or counts
> anything.
> 
> We can't check whether there is enough space in the XSK RX ring
> before
> running the XDP program, as we don't know in advance which XSK socket
> will be selected.
> 
> We can't store bpf_redirect_info across NAPI calls to avoid running
> the
> XDP program second time, as bpf_redirect_info is protected by RCU and
> the assumption that the whole XDP_REDIRECT operation happens within
> one
> NAPI cycle.
> 
> I see the following options:
> 
> 1. Drop the packet when an overflow happens. The problem is that it
> will happen systematically, but it's still better than the old
> behavior
> (drop mulitple packets when an overflow happens and hog the CPU).
> 
> 2. Make this feature opt-in. If the application opts in, it
> guarantees
> to take measures to handle duplicate packets in XDP properly.
> 
> 3. Require the XDP program to indicate it supports being called
> multiple times for the same packet and pass a flag to it if it's a
> repeated run. Drop the packet in the driver if the XDP program
> doesn't
> indicate this support. The indication can be done similar to how it's
> implemented for XDP multi buffer.
> 
> Thoughts? Other options?
> 
> Thanks,
> Max
> 
> > - terminate Rx loop only when need_wakeup feature is enabled
> > (Maxim)
> > - reword from 'stop softirq processing' to 'stop NAPI Rx
> > processing'
> > - s/ENXIO/EINVAL in mlx5 and stmmac's ndo_xsk_wakeup to keep it
> >   consitent with Intel's drivers (Maxim)
> > - include Jesper's Acks
> > 
> > Hi!
> > 
> > This is a revival of Bjorn's idea [0] to break NAPI loop when XSK
> > Rx
> > queue gets full which in turn makes it impossible to keep on
> > successfully producing descriptors to XSK Rx ring. By breaking out
> > of
> > the driver side immediately we will give the user space opportunity
> > for
> > consuming descriptors from XSK Rx ring and therefore provide room
> > in the
> > ring so that HW Rx -> XSK Rx redirection can be done.
> > 
> > Maxim asked and Jesper agreed on simplifying Bjorn's original API
> > used
> > for detecting the event of interest, so let's just simply check for
> > -ENOBUFS within Intel's ZC drivers after an attempt to redirect a
> > buffer
> > to XSK Rx. No real need for redirect API extension.
> > 
> > One might ask why it is still relevant even after having proper
> > busy
> > poll support in place - here is the justification.
> > 
> > For xdpsock that was:
> > - run for l2fwd scenario,
> > - app/driver processing took place on the same core in busy poll
> >   with 2048 budget,
> > - HW ring sizes Tx 256, Rx 2048,
> > 
> > this work improved throughput by 78% and reduced Rx queue full
> > statistic
> > bump by 99%.
> > 
> > For testing ice, make sure that you have [1] present on your side.
> > 
> > This set, besides the work described above, carries also
> > improvements
> > around return codes in various XSK paths and lastly a minor
> > optimization
> > for xskq_cons_has_entries(), a helper that might be used when XSK
> > Rx
> > batching would make it to the kernel.
> > 
> > Link to v1 and discussion around it is at [2].
> > 
> > Thanks!
> > MF
> > 
> > [0]:
> > https://lore.kernel.org/bpf/20200904135332.60259-1-bjorn.topel@xxxxxxxxx/
> > [1]:
> > https://lore.kernel.org/netdev/20220317175727.340251-1-maciej.fijalkowski@xxxxxxxxx/
> > [2]:
> > https://lore.kernel.org/bpf/5864171b-1e08-1b51-026e-5f09b8945076@xxxxxxxxxx/T/
> > 
> > Björn Töpel (1):
> >   xsk: improve xdp_do_redirect() error codes
> > 
> > Maciej Fijalkowski (13):
> >   xsk: diversify return codes in xsk_rcv_check()
> >   ice: xsk: decorate ICE_XDP_REDIR with likely()
> >   ixgbe: xsk: decorate IXGBE_XDP_REDIR with likely()
> >   ice: xsk: terminate Rx side of NAPI when XSK Rx queue gets full
> >   i40e: xsk: terminate Rx side of NAPI when XSK Rx queue gets full
> >   ixgbe: xsk: terminate Rx side of NAPI when XSK Rx queue gets full
> >   ice: xsk: diversify return values from xsk_wakeup call paths
> >   i40e: xsk: diversify return values from xsk_wakeup call paths
> >   ixgbe: xsk: diversify return values from xsk_wakeup call paths
> >   mlx5: xsk: diversify return values from xsk_wakeup call paths
> >   stmmac: xsk: diversify return values from xsk_wakeup call paths
> >   ice: xsk: avoid refilling single Rx descriptors
> >   xsk: drop ternary operator from xskq_cons_has_entries
> > 
> >  .../ethernet/intel/i40e/i40e_txrx_common.h    |  1 +
> >  drivers/net/ethernet/intel/i40e/i40e_xsk.c    | 38 ++++++++-----
> >  drivers/net/ethernet/intel/ice/ice_txrx.h     |  1 +
> >  drivers/net/ethernet/intel/ice/ice_xsk.c      | 53 ++++++++++++---
> > ----
> >  .../ethernet/intel/ixgbe/ixgbe_txrx_common.h  |  1 +
> >  drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c  | 52 ++++++++++-----
> > ---
> >  .../ethernet/mellanox/mlx5/core/en/xsk/tx.c   |  2 +-
> >  .../net/ethernet/stmicro/stmmac/stmmac_main.c |  4 +-
> >  net/xdp/xsk.c                                 |  4 +-
> >  net/xdp/xsk_queue.h                           |  4 +-
> >  10 files changed, 99 insertions(+), 61 deletions(-)
> > 
> 





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux