On 08/02/2023 13:20, Jiri Olsa wrote:
> On Tue, Feb 07, 2023 at 05:14:54PM +0000, Alan Maguire wrote:
>
> SNIP
>
>>
>> Changes since v2 [2]
>> - Arnaldo incorporated some of the suggestions in the v2 thread;
>>   these patches are based on those; the relevant changes are
>>   noted as committer changes.
>> - Patch 1 is unchanged from v2, but the rest of the patches
>>   have been updated:
>> - Patch 2 separates out the changes to the struct btf_encoder
>>   that better support later addition of functions.
>> - Patch 3 then is changed insofar as these changes are no
>>   longer needed for the function addition refactoring.
>> - Patch 4 has a small change; we need to verify that an
>>   encoder has actually been added to the encoders list
>>   prior to removal
>> - Patch 5 changed significantly; when attempting to measure
>>   performance the relatively good numbers attained when using
>>   delayed function addition were not reproducible.
>>   Further analysis revealed that the large number of lookups
>>   caused by the presence of the separate function tree was
>>   a major cause of performance degradation in the multi
>>   threaded case. So instead of maintaining a separate tree,
>>   we use the ELF function list which we already need to look
>>   up to match ELF -> DWARF function descriptions to store
>>   the function representation. This has 2 benefits; firstly
>>   as mentioned, we already look up the ELF function so no
>>   additional lookup is required to save the function.
>>   Secondly, the ELF representation is identical for each
>>   encoder, so we can index the same function across multiple
>>   encoder function arrays - this greatly speeds up the
>>   processing of comparing function representations across
>>   encoders. There is still a performance cost in this
>
> awesome.. great we can do it without the extra tree
>
> I wonder we could save some cycles just by memdup-ing the encoder->functions
> array for the subsequent encoders, but that's ok for another patch ;-)
>

great idea; also provides extra assurance the layout of the ELF function
arrays are identical! I'd started to explore having ELF info allocated
once in main encoder thread and just duped for other threads; should
definitely save some time.

thanks!

Alan

> thanks,
> jirka
>
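
For illustration only, here is a rough sketch of the memdup idea discussed above. It is not the pahole code: `struct elf_function_sketch`, `struct encoder_sketch`, `memdup_sketch()`, `encoder_sketch__dup_functions()` and `same_function_at()` are hypothetical names, and the fields are simplified assumptions. It only shows that a duplicated array keeps the same layout, so index i refers to the same ELF function in every encoder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a per-function ELF record. */
struct elf_function_sketch {
	const char *name;   /* points into shared ELF string data; not owned */
	bool        saved;  /* per-encoder state: function already saved? */
};

/* Hypothetical encoder holding only the fields this sketch needs. */
struct encoder_sketch {
	struct elf_function_sketch *functions;
	size_t                      cnt;
};

/* Plain malloc + memcpy duplicate; returns NULL on allocation failure. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *dst = malloc(len);

	if (dst)
		memcpy(dst, src, len);
	return dst;
}

/* Give a subsequent encoder its own copy of the first encoder's array. */
static int encoder_sketch__dup_functions(struct encoder_sketch *to,
					 const struct encoder_sketch *from)
{
	to->functions = NULL;
	to->cnt = 0;
	if (from->cnt == 0)
		return 0;

	to->functions = memdup_sketch(from->functions,
				      from->cnt * sizeof(*from->functions));
	if (!to->functions)
		return -1;
	to->cnt = from->cnt;

	/* Per-encoder state must start clean in the copy. */
	for (size_t i = 0; i < to->cnt; i++)
		to->functions[i].saved = false;
	return 0;
}

/* Index i names the same ELF function in both arrays, so no lookup is needed. */
static bool same_function_at(const struct encoder_sketch *a,
			     const struct encoder_sketch *b, size_t i)
{
	assert(a && b);
	return i < a->cnt && i < b->cnt &&
	       strcmp(a->functions[i].name, b->functions[i].name) == 0;
}
```

Because the copy is byte-for-byte apart from the reset per-encoder flag, the identical-layout property comes for free rather than relying on each thread building its array in the same order.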