Re: [PATCH v5 bpf-next 01/14] xdp: introduce mb in xdp_buff/xdp_frame

Alexander Duyck <alexander.duyck@xxxxxxxxx> · Mon, 7 Dec 2020 13:16:19 -0800

On Mon, Dec 7, 2020 at 8:36 AM Lorenzo Bianconi <lorenzo@xxxxxxxxxx> wrote:
>
> Introduce a multi-buffer bit (mb) in the xdp_frame/xdp_buff data structures
> in order to specify if this is a linear buffer (mb = 0) or a multi-buffer
> frame (mb = 1). In the latter case the shared_info area at the end of the
> first buffer has been properly initialized to link together subsequent
> buffers.
>
> Signed-off-by: Lorenzo Bianconi <lorenzo@xxxxxxxxxx>
> ---
>  include/net/xdp.h | 8 ++++++--
>  net/core/xdp.c    | 1 +
>  2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/xdp.h b/include/net/xdp.h
> index 700ad5db7f5d..70559720ff44 100644
> --- a/include/net/xdp.h
> +++ b/include/net/xdp.h
> @@ -73,7 +73,8 @@ struct xdp_buff {
>         void *data_hard_start;
>         struct xdp_rxq_info *rxq;
>         struct xdp_txq_info *txq;
> -       u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/
> +       u32 frame_sz:31; /* frame size to deduce data_hard_end/reserved tailroom*/
> +       u32 mb:1; /* xdp non-linear buffer */
>  };
>

If we are really going to do something like this, I say we should just
rip a swath of bits out instead of grabbing only one. Since we are
already cutting the size down, we should decide on the minimum size
that is acceptable and jump straight to that instead of stealing one
bit at a time. It looks like the frame_sz width here already differs
from the one in xdp_frame (31 versus 23 bits with this patch).

If we have to steal a bit, why not look at one of the lower 2 or 3
bits of the rxq pointer? You could then do the same thing using dev_rx
in a similar fashion, instead of stealing from a field that is likely
to be used in multiple spots, where turning it into a bit-field adds
extra overhead.
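
To make the pointer idea concrete, here is a rough userspace sketch,
not kernel code. It assumes the rxq structure is at least 8-byte
aligned so the low 3 bits of its address are always zero. All names
(tag_rxq_info, tag_buff, tag_set_mb and so on) are hypothetical and
exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xdp_rxq_info; only its alignment matters. */
struct tag_rxq_info {
	_Alignas(8) uint32_t queue_index;
};

#define TAG_FLAG_MB   ((uintptr_t)0x1) /* bit 0: multi-buffer */
#define TAG_FLAG_MASK ((uintptr_t)0x7) /* low 3 bits are free at 8-byte alignment */

/* Hypothetical buffer: the rxq pointer and the flag bits share one word. */
struct tag_buff {
	uintptr_t rxq_and_flags;
};

static inline void tag_set_rxq(struct tag_buff *b, struct tag_rxq_info *rxq)
{
	/* The scheme only works if the low bits of the pointer are zero. */
	assert(((uintptr_t)rxq & TAG_FLAG_MASK) == 0);
	/* Keep whatever flags were already set, replace the pointer part. */
	b->rxq_and_flags = (uintptr_t)rxq | (b->rxq_and_flags & TAG_FLAG_MASK);
}

static inline struct tag_rxq_info *tag_rxq(const struct tag_buff *b)
{
	/* Mask the flag bits off before using the value as a pointer. */
	return (struct tag_rxq_info *)(b->rxq_and_flags & ~TAG_FLAG_MASK);
}

static inline void tag_set_mb(struct tag_buff *b, bool mb)
{
	if (mb)
		b->rxq_and_flags |= TAG_FLAG_MB;
	else
		b->rxq_and_flags &= ~TAG_FLAG_MB;
}

static inline bool tag_is_mb(const struct tag_buff *b)
{
	return (b->rxq_and_flags & TAG_FLAG_MB) != 0;
}
```

With something like this frame_sz keeps its full width, and the cost
of the masking lands on the rxq accessors rather than on every user
of frame_sz.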

>  /* Reserve memory area at end-of data area.
> @@ -97,7 +98,8 @@ struct xdp_frame {
>         u16 len;
>         u16 headroom;
>         u32 metasize:8;
> -       u32 frame_sz:24;
> +       u32 frame_sz:23;
> +       u32 mb:1; /* xdp non-linear frame */
>         /* Lifetime of xdp_rxq_info is limited to NAPI/enqueue time,
>          * while mem info is valid on remote CPU.
>          */

Again, if we are just going to start shrinking frame_sz, we should
probably define the limit we are willing to accept and go straight to
that value. Otherwise we are going to start jeopardizing backwards
compatibility at some point when we steal too many bits.
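
As an illustration of why the limit matters, a small userspace sketch
with the field widths taken from this patch. The struct and helper
names (width_buff, width_frame, width_copy_frame_sz) are made up for
the example. A 23-bit field tops out at 8388607, and copying a larger
value from the 31-bit field silently drops the upper bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field widths as proposed in this patch; everything else is omitted. */
struct width_buff {
	unsigned int frame_sz:31;
	unsigned int mb:1;
};

struct width_frame {
	unsigned int metasize:8;
	unsigned int frame_sz:23;
	unsigned int mb:1;
};

#define WIDTH_FRAME_SZ_BITS 23
#define WIDTH_FRAME_SZ_MAX  ((1u << WIDTH_FRAME_SZ_BITS) - 1)

/* True if the value survives a copy into the 23-bit field unchanged. */
static inline bool width_frame_sz_fits(uint32_t frame_sz)
{
	return frame_sz <= WIDTH_FRAME_SZ_MAX;
}

/* Mirrors the plain assignment in the patch and returns what was stored. */
static inline uint32_t width_copy_frame_sz(const struct width_buff *b)
{
	struct width_frame f = { 0 };

	/* Unsigned bit-field assignment keeps only the low 23 bits. */
	f.frame_sz = b->frame_sz;
	return f.frame_sz;
}
```

Any further bit taken from frame_sz halves WIDTH_FRAME_SZ_MAX again,
which is why I would rather agree on the final width once.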

> @@ -154,6 +156,7 @@ void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp)
>         xdp->data_end = frame->data + frame->len;
>         xdp->data_meta = frame->data - frame->metasize;
>         xdp->frame_sz = frame->frame_sz;
> +       xdp->mb = frame->mb;
>  }
>
>  static inline
> @@ -180,6 +183,7 @@ int xdp_update_frame_from_buff(struct xdp_buff *xdp,
>         xdp_frame->headroom = headroom - sizeof(*xdp_frame);
>         xdp_frame->metasize = metasize;
>         xdp_frame->frame_sz = xdp->frame_sz;
> +       xdp_frame->mb = xdp->mb;
>
>         return 0;
>  }
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 17ffd33c6b18..79dd45234e4d 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -509,6 +509,7 @@ struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp)
>         xdpf->headroom = 0;
>         xdpf->metasize = metasize;
>         xdpf->frame_sz = PAGE_SIZE;
> +       xdpf->mb = xdp->mb;
>         xdpf->mem.type = MEM_TYPE_PAGE_ORDER0;
>
>         xsk_buff_free(xdp);

At this point all you are doing is moving a meaningless flag. I would
think we would want to wait on adding this code until there is some
meaning behind the bit, since it doesn't make sense to convert a
multi-buffer xdp_frame to a buffer. If nothing else, it really feels
like some exception handling is missing here: I would expect
conversion of a multi-buffer frame to fail, since you cannot convert
multiple buffers into a single one without redoing allocations and/or
linearizing the data.
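
Roughly what I have in mind, as a simplified userspace sketch rather
than a proposed patch. The structs and conv_to_frame() are
hypothetical, and returning an int with -EOPNOTSUPP is an assumption
made for the example; the real xdp_convert_zc_to_xdp_frame() returns a
pointer and would need to signal failure its own way.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, heavily trimmed versions of xdp_buff and xdp_frame. */
struct conv_buff {
	uint32_t frame_sz;
	uint32_t metasize;
	unsigned int mb:1;
};

struct conv_frame {
	uint32_t frame_sz;
	uint32_t metasize;
	unsigned int mb:1;
};

/*
 * Convert a buffer into a single-buffer frame. A multi-buffer source is
 * rejected up front instead of having its flag copied along, because the
 * destination can only describe one buffer.
 */
static inline int conv_to_frame(const struct conv_buff *xdp,
				struct conv_frame *xdpf)
{
	if (xdp->mb)
		return -EOPNOTSUPP;

	xdpf->frame_sz = xdp->frame_sz;
	xdpf->metasize = xdp->metasize;
	xdpf->mb = 0; /* the result is linear by construction */
	return 0;
}
```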