On 10/10/2024 01:36, Ihor Solodrai wrote:
> On Wednesday, October 9th, 2024 at 4:43 PM, Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> [...]
>>
>> Do you have the performance / memory usage stats for next vs this patch-set?
>
> Hi Eduard.
>
> Yes, I ran perf stat, and looked at max memory as reported by
> `/usr/bin/time -v`. The difference is insignificant compared to
> acmel/dwarves:next (a1241b0) [1]. See below.
>
> In terms of speed I didn't expect an improvement. It might even have
> gotten worse due to potential encoder thread synchronization when
> accessing the elf_functions table. The table is now built once, but
> before the changes it was built once *per thread*.
>
> As for memory, no difference is a little surprising, as we now have one
> table instead of N (where N is the number of threads). But more stuff
> was added to elf_function, so I guess it ate all potential gains.
>
> Performance counter stats for './pahole -J -j8 --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs --btf_encode_detached=/dev/null --lang_exclude=rust ~/repo/bpf-dev-docker/linux/.tmp_vmlinux1' (31 runs):
>
> acmel/dwarves:next
>
>     68,904,383,016  cycles                    ( +- 0.30% )
>
>     5.5862 +- 0.0304 seconds time elapsed     ( +- 0.54% )
>
> vs this patchset
>
>     68,235,717,886  cycles                    ( +- 0.30% )
>
>     5.5550 +- 0.0412 seconds time elapsed     ( +- 0.74% )
>
> Memory on acmel/dwarves:next:
>     Maximum resident set size (kbytes): 1392640
>     Maximum resident set size (kbytes): 1394600
>     Maximum resident set size (kbytes): 1393788
>
> Memory on this patchset:
>     Maximum resident set size (kbytes): 1393564
>     Maximum resident set size (kbytes): 1394840
>     Maximum resident set size (kbytes): 1392348
>
> [1] https://github.com/acmel/dwarves/commit/a1241b095de948becfed882929dda7c4318e022a

Thanks for these stats! In general, I really like the direction; it also
fits neatly with our future plans around encoding additional info such as
function addresses; the previous approach of storing all info via name
matching wasn't ideal for that. I'll try it out and review the changes
tomorrow.

One thing I'm curious about: I presume the above stats are for
single-threaded peak memory utilization, right? If that is correct, how do
things scale as we add threads? I'd assume that since we're now sharing the
ELF function info, we should see a drop in peak memory utilization for
nthreads > 1 (as compared to the baseline next code)?

Another thing we'd encourage is running the tests; you can do this via

  vmlinux=/path/2/vmlinux ./tests/tests

There aren't too many there today, but we're working on growing the set of
tests. If you also set VERBOSE=1 you get a lot of extra info around
function encoding.

One thing I think we should be careful about is ensuring we get a similar
number of functions encoded with these changes as compared to baseline. I
don't see any major reason why we wouldn't, but it's good to check
regardless.

Thanks!

Alan
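
P.S. On the thread-scaling question above: one rough way to collect peak
RSS per thread count is a loop along these lines, reusing the command line
from your perf run (the vmlinux path here is just a placeholder; adjust to
taste):

  for n in 1 2 4 8; do
    echo "threads: $n"
    /usr/bin/time -v ./pahole -J -j$n \
      --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func,decl_tag_kfuncs \
      --btf_encode_detached=/dev/null --lang_exclude=rust \
      /path/2/vmlinux 2>&1 | grep 'Maximum resident set size'
  done

Comparing those numbers between baseline next and this patchset should show
whether sharing the elf_functions table pays off at higher thread counts.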
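
P.P.S. For the function-count check, one option (assuming bpftool is handy;
vmlinux.btf is just a stand-in name for a detached BTF output file) is to
encode with --btf_encode_detached=vmlinux.btf using both baseline and
patched pahole, then compare something like

  bpftool btf dump file vmlinux.btf | grep -c "] FUNC '"

between the two runs; the counts should be very close, if not identical.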