> > Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx> writes:
> >
> > >> Lorenzo Bianconi <lorenzo@xxxxxxxxxx> writes:
> > >>
> > >> > Introduce veth_convert_xdp_buff_from_skb routine in order to
> > >> > convert a non-linear skb into an xdp buffer. If the received skb
> > >> > is cloned or shared, veth_convert_xdp_buff_from_skb will copy it
> > >> > into a new skb composed of order-0 pages for the linear and the
> > >> > fragmented areas. Moreover, veth_convert_xdp_buff_from_skb guarantees
> > >> > we have enough headroom for xdp.
> > >> > This is a preliminary patch to allow attaching xdp programs with frags
> > >> > support to veth devices.
> > >> >
> > >> > Signed-off-by: Lorenzo Bianconi <lorenzo@xxxxxxxxxx>
> > >>
> > >> It's cool that we can do this! A few comments below:
> > >
> > > Hi Toke,
> > >
> > > thx for the review :)
> > >
> > > [...]
> > >
> > >> > +static int veth_convert_xdp_buff_from_skb(struct veth_rq *rq,
> > >> > +					  struct xdp_buff *xdp,
> > >> > +					  struct sk_buff **pskb)
> > >> > +{
> > >>
> > >> nit: It's not really "converting" an skb into an xdp_buff, since the
> > >> xdp_buff lives on the stack; so maybe 'veth_init_xdp_buff_from_skb()'?
> > >
> > > I kept the previous naming convention used for xdp_convert_frame_to_buff()
> > > (my goal would be to move it into xdp.c and reuse this routine for the
> > > generic-xdp use case) but I am fine with
> > > veth_init_xdp_buff_from_skb().
> >
> > Consistency is probably good, but right now we have functions of the
> > form 'xdp_convert_X_to_Y()' and 'xdp_update_Y_from_X()'.
> > So to follow
> > that you'd have either 'veth_update_xdp_buff_from_skb()' or
> > 'veth_convert_skb_to_xdp_buff()' :)
>
> ack, I am fine with veth_convert_skb_to_xdp_buff()
>
> > >> > +	struct sk_buff *skb = *pskb;
> > >> > +	u32 frame_sz;
> > >> >
> > >> >  	if (skb_shared(skb) || skb_head_is_locked(skb) ||
> > >> > -	    skb_is_nonlinear(skb) || headroom < XDP_PACKET_HEADROOM) {
> > >> > +	    skb_shinfo(skb)->nr_frags) {
> > >>
> > >> So this always clones the skb if it has frags? Is that really needed?
> > >
> > > if we look at skb_cow_data(), the paged area is always considered not
> > > writable
> >
> > Ah, right, did not know that. Seems a bit odd, but OK.
> >
> > >> Also, there's a lot of memory allocation and copying going on here; have
> > >> you measured the performance?
> > >
> > > even in the previous implementation we always reallocate the skb if the
> > > conditions above are verified, so I do not expect any difference in the
> > > single-buffer use case, but I will run some performance tests.
> >
> > No, I wouldn't expect any difference for the single-buffer case, but I
> > would also be interested in how big the overhead is of having to copy
> > the whole jumbo-frame?
>
> oh ok, I got what you mean. I guess we can compare the tcp throughput for
> the legacy skb mode (when no program is attached on the veth pair) and xdp
> mode (when we load a simple xdp program that just returns xdp_pass) when
> jumbo frames are enabled. I would expect a performance penalty but let's see.
I ran the tests described above and got the following results:

- skb mode mtu 1500B (TSO/GSO off): ~16.8 Gbps
- xdp mode mtu 1500B (XDP_PASS):    ~9.52 Gbps
- skb mode mtu 32KB (TSO/GSO off):  ~41 Gbps
- xdp mode mtu 32KB (XDP_PASS):     ~25 Gbps

The (expected) performance penalty ratio (due to the copy) is quite constant.

Regards,
Lorenzo

> >
> > BTW, just noticed one other change - before we had:
> >
> > > -	headroom = skb_headroom(skb) - mac_len;
> > >  	if (skb_shared(skb) || skb_head_is_locked(skb) ||
> > > -	    skb_is_nonlinear(skb) || headroom < XDP_PACKET_HEADROOM) {
> >
> > And in your patch that becomes:
> >
> > > +	} else if (skb_headroom(skb) < XDP_PACKET_HEADROOM &&
> > > +		   pskb_expand_head(skb, VETH_XDP_HEADROOM, 0, GFP_ATOMIC)) {
> > > +		goto drop;
> >
> > So the mac_len subtraction disappeared; that seems wrong?
>
> we call __skb_push() before running veth_convert_xdp_buff_from_skb() in
> veth_xdp_rcv_skb(), so mac_len is already accounted for there.
>
> > >> > +
> > >> > +	if (xdp_buff_has_frags(&xdp))
> > >> > +		skb->data_len = skb_shinfo(skb)->xdp_frags_size;
> > >> > +	else
> > >> > +		skb->data_len = 0;
> > >>
> > >> We can remove entire frags using xdp_adjust_tail, right? Will that get
> > >> propagated in the right way to the skb frags due to the dual use of
> > >> skb_shared_info, or?
> > >
> > > bpf_xdp_frags_shrink_tail() can remove entire frags and it will modify
> > > the metadata contained in the skb_shared_info (e.g. nr_frags or the frag
> > > size of a given page). We should consider the data_len field in this
> > > case. Agree?
> >
> > Right, that's what I assumed; makes sense. But adding a comment
> > mentioning this above the update of data_len might be helpful? :)
>
> ack, will do.
>
> Regards,
> Lorenzo
>
> >
> > -Toke
> >