Re: [PATCH net-next v4 4/4] net: gro: move L3 flush checks to tcp_gro_receive

Richard Gobert <richardbgobert@xxxxxxxxx> · Tue, 26 Mar 2024 18:25:02 +0100

Paolo Abeni wrote:
> Hi,
> 
> On Tue, 2024-03-26 at 16:02 +0100, Richard Gobert wrote:
>> This patch is meaningful by itself - removing checks against non-relevant
>> packets and making the flush/flush_id checks in a single place.
> 
> I'm personally not sure this patch is a win. The code churn is
> significant. I understand this is for performance's sake, but I don't
> see the benefit??? 
> 

Could you clarify what do you mean by code churn?

This patch removes all use of p->flush and flush_id from the
CB. The entire logic for L3 flush_id is scattered in tcp_gro_receive
and {inet,ipv6}_gro_receive with conditionals rewriting ->flush,
->flush_id and ->is_atomic. Moving it to one place (gro_network_flush)
should be more readable. (Personally, it took me a lot of time to
understand the current logic of flush + flush_id + is_atomic)

> The changelog shows that perf reports slightly lower figures for
> inet_gro_receive(). That is expected, as this patch move code out of
> such functio. What about inet_gro_flush()/tcp_gro_receive() where such
> code is moved?
> 

Please consider the following 2 common scenarios:

1) Multiple packets in the GRO bucket - the common case with multiple
   packets in the bucket (i.e. running super_netperf TCP_STREAM) - each layer
   executes a for loop - going over each packet in the bucket. Specifically,
   L3 gro_receive loops over the bucket making flush,flush_id,is_atomic
   checks. For most packets in the bucket, these checks are not
   relevant. (possibly also dirtying cache lines with non-relevant p
   packets). Removing code in the for loop for this case is significant.

2) UDP/TCP streams which do not coalesce in GRO. This is the common case
   for regular UDP connections (i.e. running netperf UDP_STREAM). In this
   case, GRO is just overhead. Removing any code from these layers
   is good (shown in the first measurement of the commit message).

In the case of a single TCP connection - the amount of checks should be
the same overall not causing any noticeable difference.

> Additionally the reported deltas is within noise level according to my
> personal experience with similar tests.
> 

I've tested the difference between net-next and this patch repetitively,
which showed stable results each time. Is there any specific test you
think would be helpful to show the result?

Thanks