> Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx> writes:
>
> >> Lorenzo Bianconi <lorenzo@xxxxxxxxxx> writes:
> >>
> >> > Introduce veth_convert_xdp_buff_from_skb routine in order to
> >> > convert a non-linear skb into a xdp buffer. If the received skb
> >> > is cloned or shared, veth_convert_xdp_buff_from_skb will copy it
> >> > in a new skb composed by order-0 pages for the linear and the
> >> > fragmented area. Moreover veth_convert_xdp_buff_from_skb guarantees
> >> > we have enough headroom for xdp.
> >> > This is a preliminary patch to allow attaching xdp programs with frags
> >> > support on veth devices.
> >> >
> >> > Signed-off-by: Lorenzo Bianconi <lorenzo@xxxxxxxxxx>
> >>
> >> It's cool that we can do this! A few comments below:
> >
> > Hi Toke,
> >
> > thx for the review :)
> >
> > [...]
> >
> >> > +static int veth_convert_xdp_buff_from_skb(struct veth_rq *rq,
> >> > +					  struct xdp_buff *xdp,
> >> > +					  struct sk_buff **pskb)
> >> > +{
> >>
> >> nit: It's not really "converting" and skb into an xdp_buff, since the
> >> xdp_buff lives on the stack; so maybe 'veth_init_xdp_buff_from_skb()'?
> >
> > I kept the previous naming convention used for xdp_convert_frame_to_buff()
> > (my goal would be to move it in xdp.c and reuse this routine for the
> > generic-xdp use case) but I am fine with
> > veth_init_xdp_buff_from_skb().
>
> Consistency is probably good, but right now we have functions of the
> form 'xdp_convert_X_to_Y()' and 'xdp_update_Y_from_X()'. So to follow
> that you'd have either 'veth_update_xdp_buff_from_skb()' or
> 'veth_convert_skb_to_xdp_buff()' :)

ack, I am fine with veth_convert_skb_to_xdp_buff()

>
> >> > +	struct sk_buff *skb = *pskb;
> >> > +	u32 frame_sz;
> >> >
> >> >  	if (skb_shared(skb) || skb_head_is_locked(skb) ||
> >> > -	    skb_is_nonlinear(skb) || headroom < XDP_PACKET_HEADROOM) {
> >> > +	    skb_shinfo(skb)->nr_frags) {
> >>
> >> So this always clones the skb if it has frags? Is that really needed?
> >
> > if we look at skb_cow_data(), paged area is always considered not writable
>
> Ah, right, did not know that. Seems a bit odd, but OK.
>
> >> Also, there's a lot of memory allocation and copying going on here; have
> >> you measured the performance?
> >
> > even in the previous implementation we always reallocate the skb if the
> > conditions above are verified so I do not expect any difference in the single
> > buffer use-case but I will run some performance tests.
>
> No, I wouldn't expect any difference for the single-buffer case, but I
> would also be interested in how big the overhead is of having to copy
> the whole jumbo-frame?

oh ok, I got what you mean. I guess we can compare the tcp throughput for
the legacy skb mode (when no program is attached on the veth pair) and xdp
mode (when we load a simple xdp program that just returns xdp_pass) when
jumbo frames are enabled. I would expect a performance penalty but let's see.

>
> BTW, just noticed one other change - before we had:
>
> > -	headroom = skb_headroom(skb) - mac_len;
> >  	if (skb_shared(skb) || skb_head_is_locked(skb) ||
> > -	    skb_is_nonlinear(skb) || headroom < XDP_PACKET_HEADROOM) {
>
>
> And in your patch that becomes:
>
> > +	} else if (skb_headroom(skb) < XDP_PACKET_HEADROOM &&
> > +		   pskb_expand_head(skb, VETH_XDP_HEADROOM, 0, GFP_ATOMIC)) {
> > +		goto drop;
>
>
> So the mac_len subtraction disappeared; that seems wrong?

we call __skb_push before running veth_convert_xdp_buff_from_skb() in
veth_xdp_rcv_skb().

>
> >> > +
> >> > +	if (xdp_buff_has_frags(&xdp))
> >> > +		skb->data_len = skb_shinfo(skb)->xdp_frags_size;
> >> > +	else
> >> > +		skb->data_len = 0;
> >>
> >> We can remove entire frags using xdp_adjust_tail, right? Will that get
> >> propagated in the right way to the skb frags due to the dual use of
> >> skb_shared_info, or?
> >
> > bpf_xdp_frags_shrink_tail() can remove entire frags and it will modify
> > metadata contained in the skb_shared_info (e.g. nr_frags or the frag
> > size of the given page). We should consider the data_len field in this
> > case. Agree?
>
> Right, that's what I assumed; makes sense. But adding a comment
> mentioning this above the update of data_len might be helpful? :)

ack, will do.

Regards,
Lorenzo

>
> -Toke
>
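
For reference, the data_len point discussed above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the mock_* structs and helpers are hypothetical stand-ins (not kernel code), mock_shrink_tail() only loosely mimics what a tail shrink does to the shared-info metadata, and nr_frags is used as a proxy for xdp_buff_has_frags().

```c
#include <assert.h>

#define MOCK_MAX_FRAGS 4	/* arbitrary limit for this sketch */

/* Hypothetical stand-in for the frag metadata kept in skb_shared_info. */
struct mock_shinfo {
	unsigned int nr_frags;
	unsigned int xdp_frags_size;	/* total bytes held in frags */
	unsigned int frag_size[MOCK_MAX_FRAGS];
};

/* Hypothetical stand-in for the skb; only data_len is modelled. */
struct mock_skb {
	unsigned int data_len;		/* paged bytes only */
	struct mock_shinfo shinfo;
};

/* Trim 'bytes' from the tail, dropping any frag that becomes empty.
 * Note that only the shared-info metadata is touched here.
 */
static void mock_shrink_tail(struct mock_shinfo *si, unsigned int bytes)
{
	assert(si->nr_frags <= MOCK_MAX_FRAGS);

	while (bytes && si->nr_frags) {
		unsigned int *sz = &si->frag_size[si->nr_frags - 1];
		unsigned int shrink = bytes < *sz ? bytes : *sz;

		*sz -= shrink;
		si->xdp_frags_size -= shrink;
		bytes -= shrink;
		if (*sz == 0)
			si->nr_frags--;
	}
}

/* Frag metadata may have changed while the program ran, so data_len
 * has to be refreshed from it afterwards.
 */
static void mock_sync_data_len(struct mock_skb *skb)
{
	skb->data_len = skb->shinfo.nr_frags ? skb->shinfo.xdp_frags_size : 0;
}
```

Without the final sync step, data_len would keep its stale pre-shrink value even though nr_frags and the frag sizes have already changed.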