Hi Richard,

On Thu, May 9, 2024 at 9:09 PM Richard Gobert <richardbgobert@xxxxxxxxx> wrote:
> {inet,ipv6}_gro_receive functions perform flush checks (ttl, flags,
> iph->id, ...) against all packets in a loop. These flush checks are used in
> all merging UDP and TCP flows.
>
> These checks need to be done only once and only against the found p skb,
> since they only affect flush and not same_flow.
>
> This patch leverages correct network header offsets from the cb for both
> outer and inner network headers - allowing these checks to be done only
> once, in tcp_gro_receive and udp_gro_receive_segment. As a result,
> NAPI_GRO_CB(p)->flush is not used at all. In addition, flush_id checks are
> more declarative and contained in inet_gro_flush, thus removing the need
> for flush_id in napi_gro_cb.
>
> This results in less parsing code for non-loop flush tests for TCP and UDP
> flows.
>
> To make sure results are not within noise range - I've made netfilter drop
> all TCP packets, and measured CPU performance in GRO (in this case GRO is
> responsible for about 50% of the CPU utilization).
>
> perf top while replaying 64 parallel IP/TCP streams merging in GRO:
> (gro_receive_network_flush is compiled inline to tcp_gro_receive)
> net-next:
>         6.94% [kernel] [k] inet_gro_receive
>         3.02% [kernel] [k] tcp_gro_receive
>
> patch applied:
>         4.27% [kernel] [k] tcp_gro_receive
>         4.22% [kernel] [k] inet_gro_receive
>
> perf top while replaying 64 parallel IP/IP/TCP streams merging in GRO (same
> results for any encapsulation, in this case inet_gro_receive is top
> offender in net-next)
> net-next:
>         10.09% [kernel] [k] inet_gro_receive
>         2.08% [kernel] [k] tcp_gro_receive
>
> patch applied:
>         6.97% [kernel] [k] inet_gro_receive
>         3.68% [kernel] [k] tcp_gro_receive
>
> Signed-off-by: Richard Gobert <richardbgobert@xxxxxxxxx>

Thanks for your patch, which is now commit 4b0ebbca3e167976 ("net: gro:
move L3 flush checks to tcp_gro_receive and udp_gro_receive_segment") in
net-next/main (next-20240514).

noreply@xxxxxxxxxxxxxx reports build failures on m68k, e.g.
http://kisskb.ellerman.id.au/kisskb/buildresult/15168903/

    net/core/gro.c: In function ‘dev_gro_receive’:
    ././include/linux/compiler_types.h:460:38: error: call to ‘__compiletime_assert_654’ declared with attribute error: BUILD_BUG_ON failed: !IS_ALIGNED(offsetof(struct napi_gro_cb, zeroed), sizeof(u32))

> --- a/include/net/gro.h
> +++ b/include/net/gro.h
> @@ -36,15 +36,15 @@ struct napi_gro_cb {
>         /* This is non-zero if the packet cannot be merged with the new skb. */
>         u16     flush;
>
> -       /* Save the IP ID here and check when we get to the transport layer */
> -       u16     flush_id;
> -
>         /* Number of segments aggregated. */
>         u16     count;
>
>         /* Used in ipv6_gro_receive() and foo-over-udp and esp-in-udp */
>         u16     proto;

On most architectures, there is now a hole of 2 bytes here.
However, on m68k the minimum alignment of __wsum (__u32) below is 2,
hence there is no hole, breaking the assertion.

Probably you just want to make this explicit, by adding

    u16 pad;

here.
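
To illustrate the layout issue, here is a minimal sketch in C. It is a simplified model and not the real struct napi_gro_cb: the struct names cb_m68k_like and cb_padded, the typedef wsum_align2, and the helpers zeroed_is_u32_aligned() and print_layouts() are all hypothetical, and the single "zeroed" member only stands in for the start of the zeroed group. The aligned(2) typedef (a GCC/Clang extension) is an assumption used to mimic m68k's 2-byte alignment of 32-bit types, so the effect can be reproduced on other hosts.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>
#include <stdio.h>

/* Assumption: models a 32-bit type that only needs 2-byte alignment,
 * as on m68k. GCC and Clang let a typedef lower the alignment. */
typedef uint32_t wsum_align2 __attribute__((aligned(2)));

/* Simplified model of the fields around the hunk; NOT the real struct. */
struct cb_m68k_like {
	uint16_t flush;
	uint16_t count;
	uint16_t proto;
	wsum_align2 csum;	/* offset 6: no hole after proto */
	uint16_t zeroed;	/* stand-in for the start of the zeroed group */
};

struct cb_padded {
	uint16_t flush;
	uint16_t count;
	uint16_t proto;
	uint16_t pad;		/* explicit padding instead of an implicit hole */
	wsum_align2 csum;	/* offset 8 */
	uint16_t zeroed;
};

/* Same condition as the failing BUILD_BUG_ON, as a run-time helper. */
static int zeroed_is_u32_aligned(size_t offset)
{
	return offset % sizeof(uint32_t) == 0;
}

/* With the explicit pad, the check no longer depends on u32 alignment. */
static_assert(offsetof(struct cb_padded, zeroed) % sizeof(uint32_t) == 0,
	      "zeroed must be u32-aligned");

/* Print both layouts so the difference is visible. */
static void print_layouts(void)
{
	size_t a = offsetof(struct cb_m68k_like, zeroed);
	size_t b = offsetof(struct cb_padded, zeroed);

	printf("no pad:   csum@%zu zeroed@%zu aligned=%d\n",
	       offsetof(struct cb_m68k_like, csum), a,
	       zeroed_is_u32_aligned(a));
	printf("with pad: csum@%zu zeroed@%zu aligned=%d\n",
	       offsetof(struct cb_padded, csum), b,
	       zeroed_is_u32_aligned(b));
}
```

Calling print_layouts() from a main() should show csum at offset 6 without the pad and at offset 8 with it. In this model the explicit pad member makes the layout identical whether the compiler aligns 32-bit types to 2 or to 4 bytes, which is the point of the suggestion.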
>
> +       /* used to support CHECKSUM_COMPLETE for tunneling protocols */
> +       __wsum  csum;

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds