Willem de Bruijn wrote: > Richard Gobert wrote: > > Commits a602456 ("udp: Add GRO functions to UDP socket") and 57c67ff ("udp: > > additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the > > complete phase of gro. The functions always return skb->network_header, > > which in the case of encapsulated packets at the gro complete phase, is > > always set to the innermost L3 of the packet. That means that calling > > {ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in > > gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_ > > L3/L4 may return an unexpected value. > > > > This incorrect usage leads to a bug in GRO's UDP socket lookup. > > udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These > > *_hdr functions return network_header which will point to the innermost L3, > > resulting in the wrong offset being used in __udp{4,6}_lib_lookup with > > encapsulated packets. > > > > To fix this issue p_off param is used in *_gro_complete to pass off the > > offset of the previous layer. > > What exactly does this mean? > > This patch changes the definition of gro_complete to add a thoff > alongside the existing "nhoff".. > > > - int (*gro_complete)(struct sk_buff *skb, int nhoff); > > + int (*gro_complete)(struct sk_buff *skb, int nhoff, > > + int thoff); > > .. but also fixes up implementations to interpret the existing > argument as a thoff > > > -INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int thoff) > > +INDIRECT_CALLABLE_SCOPE int tcp4_gro_complete(struct sk_buff *skb, int nhoff, > > + int thoff) > > { > > - const struct iphdr *iph = ip_hdr(skb); > > - struct tcphdr *th = tcp_hdr(skb); > > + const struct iphdr *iph = (const struct iphdr *)(skb->data + nhoff); > > + struct tcphdr *th = (struct tcphdr *)(skb->data + thoff); > > But in some cases the new argument is not nhoff but p_off, e.g., > > > static int geneve_gro_complete(struct sock *sk, struct sk_buff *skb, > > - int nhoff) > > + int p_off, int nhoff) > > Really, the argument is the start of the next header, each callback > just casts to its expected header (ethhdr, tcphdr, etc.) > > The only place where we need to pass an extra argument is in udp, > because that needs a pointer to the network header right before the > transport header pointed to by nhoff. > > And only due to possible IPv4 options or IPv6 extension headers, we > cannot just do > > + struct udphdr *iph = (struct iphdr *)(skb->data + nhoff - sizeof(*iph)); > struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); > > I also do not immediately see an a way to avoid all the boilerplate > of a new argument in every callback. Aside from a per_cpu var -- but > that is excessive. > > But it can just be left zero in all callsites, except for > inet_gro_complete/ipv6_gro_complete, which pass in nhoff. Actually, we can avoid the boilerplate changes that add an extra arg. By changing the contract between network layer callbacks (inet_gro_complete/ipv6_gro_complete) and transport layer callbacks (tcp4_gro_complete et al). If the first pass their own unmodified offset, nhoff: err = INDIRECT_CALL_2(ops->callbacks.gro_complete, tcp4_gro_complete, udp4_gro_complete, - skb, nhoff + sizeof(*iph)); + skb, nhoff); And the latter parse the network header for total_len/payload_len, to find their original offset. It's also a bit of a hack. But a lot smaller patch, probably.