Re: [PATCH v5 bpf-next 03/14] xdp: add xdp_shared_info data structure

Shay Agroskin <shayagr@xxxxxxxxxx> · Sat, 19 Dec 2020 16:53:57 +0200

Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx> writes:

On Mon, 2020-12-07 at 17:32 +0100, Lorenzo Bianconi wrote:
> Introduce xdp_shared_info data structure to contain info about
> "non-linear" xdp frame. xdp_shared_info will alias skb_shared_info
> allowing to keep most of the frags in the same cache-line.
[...]

> +	u16 nr_frags;
> +	u16 data_length; /* paged area length */
> +	skb_frag_t frags[MAX_SKB_FRAGS];

why MAX_SKB_FRAGS ? just use a flexible array member
skb_frag_t frags[];

and enforce size via nr_frags and on the construction of the
tailroom preserved buffer, which is already being done.

this is a waste of space, at least by definition of the struct.
in your use case you do:
memcpy(frag_list, xdp_sinfo->frags, sizeof(skb_frag_t) * num_frags);
And the tailroom space was already preserved for a full skb_shinfo.
so i don't see why you need this array to be of a fixed MAX_SKB_FRAGS
size.
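
A minimal userspace sketch of the flexible-array suggestion above, under stated assumptions: demo_frag_t is a hypothetical stand-in for skb_frag_t, DEMO_MAX_FRAGS stands in for MAX_SKB_FRAGS, and every demo_* helper is invented for illustration (none of them is kernel API). It shows the bound being enforced through nr_frags and the reserved tailroom rather than through the array type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for skb_frag_t; the kernel's real layout differs. */
typedef struct {
	void *page;
	uint32_t offset;
	uint32_t len;
} demo_frag_t;

/* Assumed cap playing the role of MAX_SKB_FRAGS. */
#define DEMO_MAX_FRAGS 17

/* Header followed by a flexible array member: the type itself reserves
 * no space for fragments. */
struct demo_shared_info {
	uint16_t nr_frags;
	uint16_t data_length; /* paged area length */
	demo_frag_t frags[];
};

/* Bytes occupied by an info block carrying nr_frags entries. */
static size_t demo_shared_info_size(unsigned int nr_frags)
{
	return offsetof(struct demo_shared_info, frags) +
	       (size_t)nr_frags * sizeof(demo_frag_t);
}

/* Reset the header before the first fragment is added. */
static void demo_shared_info_init(struct demo_shared_info *si)
{
	assert(si != NULL);
	memset(si, 0, sizeof(*si));
}

/* Append one fragment. The bound is checked here against nr_frags and the
 * reserved tailroom, not by the array type. Returns 0 on success, -1 if
 * the fragment would not fit. */
static int demo_add_frag(struct demo_shared_info *si, size_t tailroom,
			 void *page, uint32_t offset, uint32_t len)
{
	demo_frag_t *frag;

	if (si->nr_frags >= DEMO_MAX_FRAGS ||
	    demo_shared_info_size(si->nr_frags + 1u) > tailroom)
		return -1;

	frag = &si->frags[si->nr_frags++];
	frag->page = page;
	frag->offset = offset;
	frag->len = len;
	/* 16-bit field as in the patch; the cast makes the truncation explicit. */
	si->data_length = (uint16_t)(si->data_length + len);
	return 0;
}
```

With this layout sizeof(struct demo_shared_info) covers only the header, so the space actually consumed depends on nr_frags, while the amount of tailroom reserved remains a separate decision made when the buffer is constructed.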

In order to avoid cache-misses, xdp_shared_info is built as a variable
on mvneta_rx_swbm() stack and it is written to "shared_info" area only
on the last fragment in mvneta_swbm_add_rx_fragment(). I used
MAX_SKB_FRAGS to be aligned with skb_shared_info struct but probably we
can use even a smaller value.
Another approach would be to define two different structs, e.g.

struct xdp_frag_metadata {
	u16 nr_frags;
	u16 data_length; /* paged area length */
};

struct xdp_frags {
	skb_frag_t frags[MAX_SKB_FRAGS];
};

and then define xdp_shared_info as

struct xdp_shared_info {
	struct xdp_frag_metadata meta;
	skb_frag_t frags[];
};

In this way we can probably optimize the space. What do you think?
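
As a rough userspace illustration of the stack-then-commit flow combined with the two-struct layout (this is not the mvneta code; sketch_frag_t, SKETCH_MAX_FRAGS and all sketch_* names are hypothetical stand-ins), the scratch copy keeps a fixed-size array on the stack, and the commit step copies the metadata plus only the fragments in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for skb_frag_t. */
typedef struct {
	void *page;
	uint32_t offset;
	uint32_t len;
} sketch_frag_t;

/* Assumed cap playing the role of MAX_SKB_FRAGS. */
#define SKETCH_MAX_FRAGS 17

struct sketch_frag_metadata {
	uint16_t nr_frags;
	uint16_t data_length; /* paged area length */
};

/* Scratch copy kept on the caller's stack while fragments arrive. */
struct sketch_stack_info {
	struct sketch_frag_metadata meta;
	sketch_frag_t frags[SKETCH_MAX_FRAGS];
};

/* Layout written into the tailroom: metadata plus a flexible array. */
struct sketch_shared_info {
	struct sketch_frag_metadata meta;
	sketch_frag_t frags[];
};

/* Record one fragment in the stack copy; returns -1 when the array is full. */
static int sketch_stack_add_frag(struct sketch_stack_info *s, void *page,
				 uint32_t offset, uint32_t len)
{
	sketch_frag_t *frag;

	if (s->meta.nr_frags >= SKETCH_MAX_FRAGS)
		return -1;

	frag = &s->frags[s->meta.nr_frags++];
	frag->page = page;
	frag->offset = offset;
	frag->len = len;
	s->meta.data_length = (uint16_t)(s->meta.data_length + len);
	return 0;
}

/* On the last fragment, copy the metadata and only the used fragments into
 * the tailroom. Returns the number of bytes the committed layout spans. */
static size_t sketch_commit(struct sketch_shared_info *dst,
			    const struct sketch_stack_info *s)
{
	size_t frag_bytes = (size_t)s->meta.nr_frags * sizeof(sketch_frag_t);

	assert(dst != NULL && s != NULL);
	dst->meta = s->meta;
	memcpy(dst->frags, s->frags, frag_bytes);
	return offsetof(struct sketch_shared_info, frags) + frag_bytes;
}
```

In this sketch only the bytes touched at commit time shrink with nr_frags; the amount of tailroom reserved at the end of the buffer is not changed by the split.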

We're still reserving ~sizeof(skb_shared_info) bytes at the end of the
first buffer and it seems like in mvneta code you keep updating all
three fields (frags, nr_frags and data_length).
Can you explain how the space is optimized by splitting the structs
please?

> +};
> +
> +static inline struct xdp_shared_info *
>  xdp_get_shared_info_from_buff(struct xdp_buff *xdp)
>  {
> -	return (struct skb_shared_info *)xdp_data_hard_end(xdp);
> +	BUILD_BUG_ON(sizeof(struct xdp_shared_info) >
> +		     sizeof(struct skb_shared_info));
> +	return (struct xdp_shared_info *)xdp_data_hard_end(xdp);
> +}
> +
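
The aliasing guard in the hunk above can be sketched in userspace C, with _Static_assert playing the role of BUILD_BUG_ON. Everything named alias_* or ALIAS_* is a hypothetical stand-in: alias_skb_info is not the real skb_shared_info layout, and the 64-byte rounding of the reserved tail is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for skb_frag_t. */
typedef struct {
	void *page;
	uint32_t offset;
	uint32_t len;
} alias_frag_t;

#define ALIAS_MAX_FRAGS 17	/* assumed cap, like MAX_SKB_FRAGS */
#define ALIAS_CACHE_LINE 64	/* assumed cache-line size */
#define ALIAS_ALIGN(x) \
	(((x) + ALIAS_CACHE_LINE - 1) & ~(size_t)(ALIAS_CACHE_LINE - 1))

/* Stand-in for skb_shared_info: extra header fields precede the frags. */
struct alias_skb_info {
	uint8_t flags;
	uint8_t nr_frags;
	uint16_t gso_size;
	void *frag_list;
	uint64_t tstamp;
	alias_frag_t frags[ALIAS_MAX_FRAGS];
};

/* Stand-in for the proposed xdp_shared_info. */
struct alias_xdp_info {
	uint16_t nr_frags;
	uint16_t data_length; /* paged area length */
	alias_frag_t frags[ALIAS_MAX_FRAGS];
};

/* Compile-time guard: the XDP view must fit in the space reserved for the
 * skb view, otherwise aliasing the same tail area would overrun it. */
_Static_assert(sizeof(struct alias_xdp_info) <= sizeof(struct alias_skb_info),
	       "alias_xdp_info must not outgrow alias_skb_info");

/* Bytes reserved at the end of every frame. */
#define ALIAS_TAILROOM ALIAS_ALIGN(sizeof(struct alias_skb_info))

/* Locate the shared info at the "hard end" of the frame's data area. */
static struct alias_xdp_info *
alias_get_shared_info(unsigned char *hard_start, size_t frame_sz)
{
	assert(frame_sz >= ALIAS_TAILROOM);
	return (struct alias_xdp_info *)(void *)
		(hard_start + frame_sz - ALIAS_TAILROOM);
}
```

Because the check runs at compile time, growing the XDP-side struct past the reserved size fails the build instead of corrupting memory at run time.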

Back to my first comment, do we have plans to use this tailroom buffer
for other than frag_list use cases? what will be the buffer format
then? should we push all new fields to the end of the xdp_shared_info
struct? or deal with this tailroom buffer as a stack?
my main concern is that for drivers that don't support frag list and
still want to utilize the tailroom buffer for other use cases they will
have to skip the first sizeof(xdp_shared_info) so they won't break the
stack.

For the moment I do not know if this area is used for other purposes.
Do you think there are other use-cases for it?

Saeed, the stack receives skb_shared_info when the frames are passed to
the stack (skb_add_rx_frag is used to add the whole information to skb's
shared info), and for the XDP_REDIRECT use case, it doesn't seem like all
drivers check the page's tailroom for more information anyway (ena
doesn't, at least).
Can you please explain what you mean by "break the stack"?

Thanks, Shay

[...]