On 05/06/2024 12.41, Sebastian Andrzej Siewior wrote:
On 2024-06-05 12:28:08 [+0200], Jesper Dangaard Brouer wrote:
Hmm, but how will this affect performance?
As I wrote in the changelog for v4, I haven't noticed a difference. I
tried to move bpf_net_ctx_set() from cpu_map_bpf_prog_run() to
cpu_map_kthread_run() to have this assignment only once and I didn't see
a difference; I couldn't tell the two kernels apart.
This would be my preferred solution.
See below: your benchmark wasn't measuring the changed code, which runs
in the kthread on the remote CPU.
This is what I have been using for testing
| xdp-bench redirect-cpu --cpu 3 --remote-action drop eth1 -e
in case I was changing the wrong part…
As we saw earlier (with your hardware setup), this test is benchmarking
the RX-NAPI XDP-redirect code, as the cpumap "remote" CPU's kthread had
idle cycles.
The extra clearing in bpf_net_ctx_set() for each packet in the kthread
on the remote CPU will not change the benchmark numbers (as it has idle
cycles).
Looking closer at the kernel code + your patch, I see that this clearing
isn't done for each packet, but per bulk (up to CPUMAP_BATCH, i.e. 8
packets). Given that, I'm more okay with this change.
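
To put a number on the per-packet vs. per-bulk point, here is a toy
userspace sketch. It is not kernel code: toy_ctx_set() is a hypothetical
stand-in that only counts how often a setup step runs, and TOY_BATCH
mirrors the CPUMAP_BATCH value of 8 mentioned above. All names are made
up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CPUMAP_BATCH; the value 8 is from this mail. */
#define TOY_BATCH 8

/* Counts how often the (hypothetical) context setup step executes. */
unsigned long toy_ctx_setups;

/* Stand-in for a per-run context setup; here it only bumps a counter. */
void toy_ctx_set(void)
{
	toy_ctx_setups++;
}

/* Setup once per packet: n packets cost n setups. */
unsigned long toy_run_per_packet(size_t n_packets)
{
	toy_ctx_setups = 0;
	for (size_t i = 0; i < n_packets; i++)
		toy_ctx_set();
	return toy_ctx_setups;
}

/* Setup once per bulk of up to TOY_BATCH packets: ceil(n / TOY_BATCH). */
unsigned long toy_run_per_bulk(size_t n_packets)
{
	size_t done = 0;

	toy_ctx_setups = 0;
	while (done < n_packets) {
		size_t bulk = n_packets - done;

		if (bulk > TOY_BATCH)
			bulk = TOY_BATCH;
		toy_ctx_set();		/* one setup covers the whole bulk */
		done += bulk;
	}
	return toy_ctx_setups;
}
```

With full bulks, the setup cost is amortized over 8 packets; a partially
filled bulk still pays for one setup, so the saving depends on how full
the bulks actually are under load.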
--Jesper