On Friday, October 11th, 2024 at 9:27 AM, Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:

[...]

> > Thanks for these stats! In general, I really like the direction; it also
> > fits neatly with our future plans around encoding additional info like
> > function addresses; the previous approach of storing all info via name
> > matching wasn't ideal for that. I'll try it out and review the changes
> > tomorrow.
> >
> > One thing I'm curious about; I presume the above stats are for
> > single-threaded peak memory utilization, right? If that is correct, how
> > do things scale as we add threads? I'd assume that since we're now
> > sharing ELF function info, we should see a drop in peak memory
> > utilization for nthreads > 1 (as compared to the baseline next code)?
>
> Did some experiments here, saw no significant difference in terms of
> peak memory utilization; this may indicate the peak memory utilization
> is a function of something else. Comparing peak memory utilization
> between 1 and 8-threaded encoding, a similar pattern is observed for
> baseline next and this series; for baseline 1 vs 8 threads we see:
>
> < Maximum resident set size (kbytes): 1069304
> ---
> > Maximum resident set size (kbytes): 1119412
>
> ...while for this series 1 vs 8 threads we see
>
> < Maximum resident set size (kbytes): 1071148
> ---
> > Maximum resident set size (kbytes): 1125052
>
> So pretty similar really. Maybe a system with a larger number of
> processors would reveal something more here and show the benefits of
> shared ELF representations.
>
> > Another thing we're encouraging is running the tests; you can do this
> > via
> >
> > vmlinux=/path/2/vmlinux ./tests/tests
> >
> > Not too many there today, but we're working on growing the set of tests.
> > If you set VERBOSE=1 too you can get a lot of extra info around function
> > encoding.
> > One thing I think we should be careful about is to ensure we
> > get a similar number of functions encoded with these changes as compared
> > to baseline. I don't see any major reason why we wouldn't, but good to
> > check regardless. Thanks!
>
> I tried this too, and compared verbose output of baseline and test
> btf_functions; both were identical:
>
> $ VERBOSE=1 vmlinux=/home/almagui/kbuild/bpf-next/vmlinux bash \
>     btf_functions.sh > /var/tmp/btf_functions.baseline
> $ VERBOSE=1 vmlinux=/home/almagui/kbuild/bpf-next/vmlinux bash \
>     btf_functions.sh > /var/tmp/btf_functions.test
> $ diff /var/tmp/btf_functions.baseline /var/tmp/btf_functions.test
> $
>
> This suggests the results are the same, since we encode the same number
> of functions, refuse to encode the same number of inconsistent
> functions, etc. Which is great!

Hi Alan. Thank you for testing! I was going to run a couple more
experiments today and respond, but you beat me to it.

I am curious about the memory usage too. I'll try measuring how much is
used by the btf_encoders and the elf_functions specifically, similar to
how I found the table was big before the function representation
changes [1]. I'll share if I find anything interesting.

[1] https://lore.kernel.org/dwarves/aQtEzThpIhRiwQpkgeZepPGr5Mt_zuGfPbxQxZgS6SSoPYoqM1afjDqEmIZkdH3YzvhWmWwqCS_8ZvFTcSZHvpkAeBpLRTAZEmrOhq0svfo=@pm.me/

> Alan
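P.S. For anyone wanting to reproduce a peak-RSS comparison like the one
quoted above, a sketch along these lines should work. The exact pahole
invocation used for the numbers above wasn't shown, so the flags, the
thread counts, and the VMLINUX path here are illustrative assumptions;
the "Maximum resident set size (kbytes)" line comes from GNU time's -v
output.

```shell
#!/bin/sh
# Hypothetical sketch (not the exact invocation behind the numbers
# above): compare peak RSS of pahole BTF encoding at 1 vs 8 threads.
# VMLINUX path, thread counts, and flags are illustrative assumptions.
VMLINUX=${VMLINUX:-/path/to/vmlinux}

if ! command -v pahole >/dev/null 2>&1 || [ ! -r "$VMLINUX" ]; then
    # Degrade gracefully when pahole or a vmlinux isn't available.
    echo "skipping: pahole or vmlinux unavailable" >&2
else
    for jobs in 1 8; do
        # --btf_encode_detached writes BTF elsewhere, leaving vmlinux
        # untouched; GNU time's -v report includes the peak-RSS line.
        /usr/bin/time -v pahole --btf_encode_detached=/dev/null \
            -j"$jobs" "$VMLINUX" 2>&1 |
            grep 'Maximum resident set size' |
            sed "s/^/threads=$jobs: /"
    done
fi
```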