On Wed, 24 Jan 2024 at 20:17, Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx> wrote:
>
> XDP multi-buffer support introduced XDP_FLAGS_HAS_FRAGS flag that is
> used by drivers to notify data path whether xdp_buff contains fragments
> or not. Data path looks up mentioned flag on first buffer that occupies
> the linear part of xdp_buff, so drivers only modify it there. This is
> sufficient for SKB and XDP_DRV modes as usually xdp_buff is allocated on
> stack or it resides within struct representing driver's queue and
> fragments are carried via skb_frag_t structs. IOW, we are dealing with
> only one xdp_buff.
>
> ZC mode though relies on list of xdp_buff structs that is carried via
> xsk_buff_pool::xskb_list, so ZC data path has to make sure that
> fragments do *not* have XDP_FLAGS_HAS_FRAGS set. Otherwise,
> xsk_buff_free() could misbehave if it would be executed against xdp_buff
> that carries a frag with XDP_FLAGS_HAS_FRAGS flag set. Such scenario can
> take place when within supplied XDP program bpf_xdp_adjust_tail() is
> used with negative offset that would in turn release the tail fragment
> from multi-buffer frame.
>
> Calling xsk_buff_free() on tail fragment with XDP_FLAGS_HAS_FRAGS would
> result in releasing all the nodes from xskb_list that were produced by
> driver before XDP program execution, which is not what is intended -
> only tail fragment should be deleted from xskb_list and then it should
> be put onto xsk_buff_pool::free_list. Such multi-buffer frame will never
> make it up to user space, so from AF_XDP application POV there would be
> no traffic running, however due to free_list getting constantly new
> nodes, driver will be able to feed HW Rx queue with recycled buffers.
> Bottom line is that instead of traffic being redirected to user space,
> it would be continuously dropped.
>
> To fix this, let us clear the mentioned flag on xsk_buff_pool side
> during xdp_buff initialization, which is what should have been done
> right from the start of XSK multi-buffer support.

Thanks!

Acked-by: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>

> Fixes: 1bbc04de607b ("ice: xsk: add RX multi-buffer support")
> Fixes: 1c9ba9c14658 ("i40e: xsk: add RX multi-buffer support")
> Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX")
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 1 -
>  drivers/net/ethernet/intel/ice/ice_xsk.c   | 1 -
>  include/net/xdp_sock_drv.h                 | 1 +
>  net/xdp/xsk_buff_pool.c                    | 1 +
>  4 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index af7d5fa6cdc1..82aca0d16a3e 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -498,7 +498,6 @@ int i40e_clean_rx_irq_zc(struct i40e_ring *rx_ring, int budget)
>  		xdp_res = i40e_run_xdp_zc(rx_ring, first, xdp_prog);
>  		i40e_handle_xdp_result_zc(rx_ring, first, rx_desc, &rx_packets,
>  					  &rx_bytes, xdp_res, &failure);
> -		first->flags = 0;
>  		next_to_clean = next_to_process;
>  		if (failure)
>  			break;
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> index 5d1ae8e4058a..d9073a618ad6 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> @@ -895,7 +895,6 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
>
>  		if (!first) {
>  			first = xdp;
> -			xdp_buff_clear_frags_flag(first);
>  		} else if (ice_add_xsk_frag(rx_ring, first, xdp, size)) {
>  			break;
>  		}
> diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
> index 526c1e7f505e..9819e2af0378 100644
> --- a/include/net/xdp_sock_drv.h
> +++ b/include/net/xdp_sock_drv.h
> @@ -164,6 +164,7 @@ static inline void xsk_buff_set_size(struct xdp_buff *xdp, u32 size)
>  	xdp->data = xdp->data_hard_start + XDP_PACKET_HEADROOM;
>  	xdp->data_meta = xdp->data;
>  	xdp->data_end = xdp->data + size;
> +	xdp->flags = 0;
>  }
>
>  static inline dma_addr_t xsk_buff_raw_get_dma(struct xsk_buff_pool *pool,
> diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
> index 28711cc44ced..ce60ecd48a4d 100644
> --- a/net/xdp/xsk_buff_pool.c
> +++ b/net/xdp/xsk_buff_pool.c
> @@ -555,6 +555,7 @@ struct xdp_buff *xp_alloc(struct xsk_buff_pool *pool)
>
>  	xskb->xdp.data = xskb->xdp.data_hard_start + XDP_PACKET_HEADROOM;
>  	xskb->xdp.data_meta = xskb->xdp.data;
> +	xskb->xdp.flags = 0;
>
>  	if (pool->dma_need_sync) {
>  		dma_sync_single_range_for_device(pool->dev, xskb->dma, 0,
> --
> 2.34.1
>
>
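
For anyone reading this thread later, the failure mode from the commit message can be reproduced in a small user-space model. This is only a sketch under stated assumptions, not kernel code: struct model_buf, struct model_pool, model_alloc(), model_free() and frags_left_after_tail_free() are all hypothetical names, MODEL_HAS_FRAGS merely stands in for XDP_FLAGS_HAS_FRAGS, and the free path follows the behaviour described in the commit message rather than the actual xsk_buff_free() source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for XDP_FLAGS_HAS_FRAGS; not the kernel definition. */
#define MODEL_HAS_FRAGS 0x1u
#define MODEL_NBUFS 3

/* Hypothetical, heavily simplified view of one pool buffer. */
struct model_buf {
	unsigned int flags;
	bool on_frag_list;	/* models membership in xskb_list */
	bool on_free_list;	/* models membership in free_list */
};

struct model_pool {
	struct model_buf bufs[MODEL_NBUFS];
};

/* Models buffer initialization; clear_flags == true models the fix. */
static struct model_buf *model_alloc(struct model_pool *pool, size_t idx,
				     bool clear_flags)
{
	struct model_buf *buf = &pool->bufs[idx];

	buf->on_free_list = false;
	if (clear_flags)
		buf->flags = 0;	/* reset any stale flag at init time */
	return buf;
}

/*
 * Models the free path as the commit message describes it: a buffer that
 * carries the frags flag is treated as a frame head, so every node on the
 * frag list is released along with it.
 */
static void model_free(struct model_pool *pool, struct model_buf *buf)
{
	size_t i;

	if (buf->flags & MODEL_HAS_FRAGS) {
		for (i = 0; i < MODEL_NBUFS; i++) {
			if (pool->bufs[i].on_frag_list) {
				pool->bufs[i].on_frag_list = false;
				pool->bufs[i].on_free_list = true;
			}
		}
	}
	buf->on_frag_list = false;
	buf->on_free_list = true;
}

/*
 * Builds a frame of head + 2 frags where the tail buffer was the head of an
 * earlier multi-buffer frame, then frees only the tail, as a negative
 * bpf_xdp_adjust_tail() offset would. Returns how many frags remain listed.
 */
int frags_left_after_tail_free(bool clear_flags_on_alloc)
{
	struct model_pool pool = {0};
	struct model_buf *head, *frag, *tail;
	int left = 0;
	size_t i;

	pool.bufs[2].flags = MODEL_HAS_FRAGS;	/* stale flag from earlier use */

	head = model_alloc(&pool, 0, clear_flags_on_alloc);
	frag = model_alloc(&pool, 1, clear_flags_on_alloc);
	tail = model_alloc(&pool, 2, clear_flags_on_alloc);

	head->flags |= MODEL_HAS_FRAGS;	/* driver marks only the first buffer */
	frag->on_frag_list = true;
	tail->on_frag_list = true;

	model_free(&pool, tail);

	for (i = 0; i < MODEL_NBUFS; i++)
		left += pool.bufs[i].on_frag_list ? 1 : 0;
	return left;
}
```

In this model, frags_left_after_tail_free(false) returns 0 because the stale flag on the tail causes the whole frag list to be released, while frags_left_after_tail_free(true) returns 1 because only the tail is removed. Clearing the flag at allocation time is the only difference between the two runs, which is why the patch moves the reset into the pool's initialization path instead of leaving it to each driver.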