On Tue, 12 Jul 2022 21:52:25 +0100 Pavel Begunkov wrote: > Even when zerocopy transmission is requested and possible, > __ip_append_data() will still copy a small chunk of data just because it > allocated some extra linear space (e.g. 148 bytes). It wastes CPU cycles > on copy and iter manipulations and also misalignes potentially aligned > data. Avoid such coies. And as a bonus we can allocate smaller skb. s/coies/copies/ can fix when applying > > Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx> > --- > net/ipv4/ip_output.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c > index 00b4bf26fd93..581d1e233260 100644 > --- a/net/ipv4/ip_output.c > +++ b/net/ipv4/ip_output.c > @@ -969,7 +969,6 @@ static int __ip_append_data(struct sock *sk, > struct inet_sock *inet = inet_sk(sk); > struct ubuf_info *uarg = NULL; > struct sk_buff *skb; > - > struct ip_options *opt = cork->opt; > int hh_len; > int exthdrlen; > @@ -977,6 +976,7 @@ static int __ip_append_data(struct sock *sk, > int copy; > int err; > int offset = 0; > + bool zc = false; > unsigned int maxfraglen, fragheaderlen, maxnonfragsize; > int csummode = CHECKSUM_NONE; > struct rtable *rt = (struct rtable *)cork->dst; > @@ -1025,6 +1025,7 @@ static int __ip_append_data(struct sock *sk, > if (rt->dst.dev->features & NETIF_F_SG && > csummode == CHECKSUM_PARTIAL) { > paged = true; > + zc = true; > } else { > uarg->zerocopy = 0; > skb_zcopy_set(skb, uarg, &extra_uref); > @@ -1091,9 +1092,12 @@ static int __ip_append_data(struct sock *sk, > (fraglen + alloc_extra < SKB_MAX_ALLOC || > !(rt->dst.dev->features & NETIF_F_SG))) > alloclen = fraglen; > - else { > + else if (!zc) { > alloclen = min_t(int, fraglen, MAX_HEADER); Willem, I think this came in with your GSO work, is there a reason we use MAX_HEADER here? I thought MAX_HEADER is for headers (i.e. more or less to be reserved) not for the min amount of data to be included. I wanna make sure we're not missing something about GSO here. Otherwise I don't think we need the extra branch but that can be a follow up. > pagedlen = fraglen - alloclen; > + } else { > + alloclen = fragheaderlen + transhdrlen; > + pagedlen = datalen - transhdrlen; > } > > alloclen += alloc_extra;