Re: [PATCH dwarves v2 07/10] btf_encoder: introduce btf_encoding_context

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Tue, 17 Dec 2024 16:03:24 -0800

On Tue, Dec 17, 2024 at 10:06 AM Ihor Solodrai <ihor.solodrai@xxxxx> wrote:
>
> On Monday, December 16th, 2024 at 7:15 PM, Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> >
> > On Mon, 2024-12-16 at 18:39 -0800, Eduard Zingerman wrote:
> >
> > > On Fri, 2024-12-13 at 22:37 +0000, Ihor Solodrai wrote:
> > >
> > > > Introduce a static struct holding global data necessary for BTF
> > > > encoding: elf_functions tables and btf_encoder structs.
> > [...]
> > >
> > > After patch #10 "dwarf_loader: multithreading with a job/worker model"
> > > from this series I do not understand why this patch is necessary.
> > > After patch #10 there is only one BTF encoder, thus:
> > > - there is no need to track btf_encoder_list;
> > > - elf_functions_list can now be a part of the encoder;
> > > - it should be possible to forgo global variable for encoder
> > > and pass it as a parameter for each btf_encoder__* func.
> > >
> > > So it seems that this patch should be dropped and replaced by one that
> > > follows patch #10 and applies the above simplifications.
> > > Wdyt?
> >
> >
> > Meaning that patch #6 "btf_encoder: switch to shared elf_functions table"
> > is not necessary. Strictly speaking, patches 1,2,4 might not be necessary
> > as well, but could be viewed as a refactoring.
> > Switch to single-threaded BTF encoder significantly changes this patch-set.
>
> Eduard, thanks for the review again.
>
> You are correct: if we focus on the multithreading changes in
> dwarf_loader.c and make a decision that there is always a single
> btf_encoder, then much of this series can be discarded.
>
> At the same time I think most of the patches are useful. At the very
> least they enabled experiments that in the end lead me to the
> dwarf_loader changes.
>
> The changes making ELF functions table shared were beneficial in
> isolation, because they eliminated unnecessary duplication of
> information between encoders, leading to reduced memory usage.
>
> The changes splitting ELF and BTF function information in
> btf_encoder.c and simplifying function processing are also good in
> isolation.
>
> In my opinion, it's not wise to discard all of that, because it turned
> out that a single btf_encoder works better in the use-case we care
> about now. Later we might want to revisit parallel BTF encoding. Then
> some version of the refactoring changes here will have to be re-done.
>
> So I think it makes sense to land most of this series without
> significant re-work. But of course I am biased here, as I wrote most
> of the patches, and it's always painful to "throw away" effort.
>
> Let's see what others think.

I agree with Ihor. I think he invested a lot of time into these
improvements, and asking him to re-do the series just to shuffle a few
patches around is just an unnecessary overhead (which also delays the
ultimate outcome: faster BTF generation with pahole). And as Ihor
mentioned, we might improve upon this series by parallelizing encoders
to gain some further improvements, so I think all the internal
refactoring and preparations are setting up a good base for further
work.

Let's try to finalize patch #10's implementation and land this. It's a
nice improvement both performance-wise. It's also good that we don't
need to care about reproducible or not and can effectively deprecate
that short-lived feature we added not so long ago.