From: Jakub Kicinski <kuba@xxxxxxxxxx>
Date: Tue, 3 Dec 2024 16:51:57 -0800

> On Tue, 3 Dec 2024 12:01:16 +0100 Alexander Lobakin wrote:
>>>> @ Jakub,
>>>
>>> Context? What doesn't work and why?
>>
>> My tests show the same perf as on Lorenzo's series, but I test with the
>> UDP trafficgen. Daniel tests TCP and the results are much worse than
>> with Lorenzo's implementation.
>> I suspect this is related to how NAPI performs flushes / decides
>> whether to repoll again or exit vs how the kthread does that (even
>> though I also try to flush only every 64 frames or when the ring is
>> empty). Or maybe to the fact that part of the kthread runs in process
>> context outside any softirq, while when using NAPI, the whole loop is
>> inside the RX softirq.
>>
>> Jesper said that he'd like to see cpumap still using its own kthread,
>> so that its priority can be boosted separately from the backlog. That's
>> why we asked you whether it would be fine to have cpumap as threaded
>> NAPI in regards to all this :D
>
> Certainly not without a clear understanding of what the problem with
> a kthread is.

Yes, sure thing. The bad thing is that I can't reproduce Daniel's problem >_<

Previously, I was testing with the UDP trafficgen and got up to an 80%
improvement over the baseline. Now I tested TCP and got up to a 70%
improvement, with no regressions whatsoever =\

I don't know where this regression on Daniel's setup comes from.
Is it a multi-thread or single-thread test? Which app do you use: iperf,
netperf, neper, Microsoft's app (forgot the name)?
Do you have multiple NUMA nodes on your system? Are you sure you didn't
cross the node when redirecting with the GRO patches, and that no other
NUMA mismatches happened? Any other random stuff like the RSS hash key,
which affects flow steering?

Thanks,
Olek
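
[Editor's note: below is a minimal, self-contained userspace sketch of the
batch/flush policy described in the quoted text above (flush every 64
frames, or whenever the ring runs empty). It is purely illustrative and
assumes nothing about the actual cpumap patch; all names, types, and the
demo workload size are hypothetical.]

/*
 * Illustrative sketch only -- a userspace simulation of the policy
 * "flush every 64 frames or when the ring is empty".  This is NOT the
 * cpumap kthread code; ring_consume(), flush_frames() and struct frame
 * are made up for the example.
 */
#include <stdbool.h>
#include <stdio.h>

#define FLUSH_BATCH	64

struct frame {
	int id;
};

/* Pretend ring: returns true and fills *f while frames remain. */
static bool ring_consume(struct frame *f)
{
	static int remaining = 150;	/* arbitrary demo workload */

	if (!remaining)
		return false;

	f->id = remaining--;
	return true;
}

/* Stand-in for the actual flush (e.g. handing frames to the stack). */
static void flush_frames(int n)
{
	printf("flush %d frames\n", n);
}

int main(void)
{
	struct frame f;
	int pending = 0;

	while (ring_consume(&f)) {
		pending++;

		/* Flush every FLUSH_BATCH frames... */
		if (pending == FLUSH_BATCH) {
			flush_frames(pending);
			pending = 0;
		}
	}

	/* ...or flush whatever is left once the ring is empty. */
	if (pending)
		flush_frames(pending);

	return 0;
}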