Willem de Bruijn wrote: > Willem de Bruijn wrote: > > Richard Gobert wrote: > > > Commits a602456 ("udp: Add GRO functions to UDP socket") and 57c67ff ("udp: > > > additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the > > > complete phase of gro. The functions always return skb->network_header, > > > which in the case of encapsulated packets at the gro complete phase, is > > > always set to the innermost L3 of the packet. That means that calling > > > {ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in > > > gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_ > > > L3/L4 may return an unexpected value. > > > > > > This incorrect usage leads to a bug in GRO's UDP socket lookup. > > > udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These > > > *_hdr functions return network_header which will point to the innermost L3, > > > resulting in the wrong offset being used in __udp{4,6}_lib_lookup with > > > encapsulated packets. > > > > > > To fix this issue p_off param is used in *_gro_complete to pass off the > > > offset of the previous layer. > > > > What exactly does this mean? > > > > This patch changes the definition of gro_complete to add a thoff > > alongside the existing "nhoff".. > > > > > - int (*gro_complete)(struct sk_buff *skb, int nhoff); > > > + int (*gro_complete)(struct sk_buff *skb, int nhoff, > > > + int thoff); > > > > .. but also fixes up implementations to interpret the existing > > argument as a thoff > > > > > -INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff) > > > +INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int nhoff, > > > + int thoff) > > > { > > > - const struct iphdr *iph = ip_hdr(skb); > > > - struct tcphdr *th = tcp_hdr(skb); > > > + const struct iphdr *iph = (const struct iphdr *)(skb->data + nhoff); > > > + struct tcphdr *th = (struct tcphdr *)(skb->data + thoff); > > > > But in some cases the new argument is not nhoff but p_off, e.g., > > > > > static int geneve_gro_complete(struct sock *sk, struct sk_buff *skb, > > > - int nhoff) > > > + int p_off, int nhoff) > > > > Really, the argument is the start of the next header, each callback > > just casts to its expected header (ethhdr, tcphdr, etc.) > > > > The only place where we need to pass an extra argument is in udp, > > because that needs a pointer to the network header right before the > > transport header pointed to by nhoff. > > > > And only due to possible IPv4 options or IPv6 extension headers, we > > cannot just do > > > > + struct udphdr *iph = (struct iphdr *)(skb->data + nhoff - sizeof(*iph)); > > struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); > > > > I also do not immediately see an a way to avoid all the boilerplate > > of a new argument in every callback. Aside from a per_cpu var -- but > > that is excessive. > > > > But it can just be left zero in all callsites, except for > > inet_gro_complete/ipv6_gro_complete, which pass in nhoff. > > Actually, we can avoid the boilerplate changes that add an extra arg. > > By changing the contract between network layer callbacks > (inet_gro_complete/ipv6_gro_complete) and transport layer callbacks > (tcp4_gro_complete et al). Actually, only when calling udp4_gro_complete or udp6_gro_complete. > It's also a bit of a hack. But a lot smaller patch, probably. Feel free to disagree with this approach. Just a suggestion.