Re: [PATCH] bpf: Separate bpf_local_storage_lookup() fast and slow paths

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/5/24 7:00 AM, Marco Elver wrote:
On Wed, 31 Jan 2024 at 20:52, Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
[...]
| num_maps: 1000
|  local_storage cache sequential  get:
|                              <before>                | <after>
|   hits throughput:           0.357 ± 0.005 M ops/s   | 0.325 ± 0.005 M ops/s        (-9.0%)
|   hits latency:              2803.738 ns/op          | 3076.923 ns/op               (+9.7%)

Is it understood why the slow down here? The same goes for the "num_maps: 32"
case above but not as bad as here.

It turned out that there's a real slowdown due to the outlined
slowpath. If I inline everything except for inserting the entry into
the cache (cacheit_lockit codepath is still outlined), the results
look much better even for the case where it always misses the cache.

[...]
diff --git a/tools/testing/selftests/bpf/progs/cgrp_ls_recursion.c b/tools/testing/selftests/bpf/progs/cgrp_ls_recursion.c
index a043d8fefdac..9895087a9235 100644
--- a/tools/testing/selftests/bpf/progs/cgrp_ls_recursion.c
+++ b/tools/testing/selftests/bpf/progs/cgrp_ls_recursion.c
@@ -21,7 +21,7 @@ struct {
       __type(value, long);
   } map_b SEC(".maps");

-SEC("fentry/bpf_local_storage_lookup")
+SEC("fentry/bpf_local_storage_lookup_slowpath")

The selftest is trying to catch recursion. The change here cannot test the same
thing because the slowpath will never be hit in the test_progs.  I don't have a
better idea for now also.

Trying to prepare a v2, and for the test, the only option I see is to
introduce a tracepoint ("bpf_local_storage_lookup"). If unused, should
be a no-op due to static branch.

Or can you suggest different functions to hook to for the recursion test?

I don't prefer to add another tracepoint for the selftest.

The test in "SEC("fentry/bpf_local_storage_lookup")" is testing that the initial bpf_local_storage_lookup() should work and the immediate recurred bpf_task_storage_delete() will fail.

Depends on how the new slow path function will look like in v2. The test can probably be made to go through the slow path, e.g. by creating a lot of task storage maps before triggering the lookup.




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux