On Wed, Mar 24, 2021 at 11:53 PM Yonghong Song <yhs@xxxxxx> wrote:
>
> This patch added an option "merge_cus", which will permit
> to merge all debug info cu's into one pahole cu.
> For vmlinux built with clang thin-lto or lto, there exist
> cross cu type references. For example, you could have
>   compile unit 1:
>      tag 10:  type A
>   compile unit 2:
>      ...
>        refer to type A (tag 10 in compile unit 1)
> I only checked a few but have seen type A may be a simple type
> like "unsigned char" or a complex type like an array of base types.
>
> There are two different ways to resolve this issue:
> (1). merge all compile units as one pahole cu so tags/types
>      can be resolved easily, or
> (2). try to do on-demand type traversal in other debuginfo cu's
>      when we do die_process().
> The method (2) is much more complicated so I picked method (1).
> An option "merge_cus" is added to permit such an operation.
>
> Merging cu's will create a single cu with lots of types, tags
> and functions. For example with clang thin-lto built vmlinux,
> I saw 9M entries in types table, 5.2M in tags table. The
> below are pahole wallclock time for different hashbits:
> command line: time pahole -J --merge_cus vmlinux
>       # of hashbits      wallclock time in seconds
>           15                   460
>           16                   255
>           17                   131
>           18                   97
>           19                   75
>           20                   69
>           21                   64
>           22                   62
>           23                   58
>           24                   64

What were the numbers for different hashbits without --merge_cus?

>
> Note that the number of hashbits 24 makes performance worse
> than 23. The reason could be that 23 hashbits can cover 8M
> buckets (close to 9M for the number of entries in types table).
> Higher number of hash bits allocates more memory and becomes
> less cache efficient compared to 23 hashbits.
>
> This patch picks # of hashbits 21 as the starting value
> and will try to allocate memory based on that, if memory
> allocation fails, we will go with less hashbits until
> we reach hashbits 15 which is the default for
> non merge-cu case.
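
To make sure I read the fallback correctly, here is a minimal sketch of what the last paragraph describes: start at 21 hashbits and step down by one bit on each allocation failure until 15. This is only an illustration, not the code from the patch. The names HASHBITS_MAX, HASHBITS_MIN, struct bucket_head and alloc_buckets() are hypothetical; only the values 21 and 15 come from the description above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names; the two values are taken from the patch description. */
#define HASHBITS_MAX 21	/* starting value when merging cu's */
#define HASHBITS_MIN 15	/* default for the non merge-cu case */

/* Stand-in for a hash bucket head, assumed here to be a single pointer. */
struct bucket_head {
	void *first;
};

/*
 * Try to allocate 2^bits zeroed buckets, starting at HASHBITS_MAX and
 * halving the table on each failure. On success the chosen number of
 * bits is stored in *bits_out. Returns NULL only if even the
 * HASHBITS_MIN-sized table cannot be allocated.
 */
static struct bucket_head *alloc_buckets(unsigned int *bits_out)
{
	unsigned int bits;

	for (bits = HASHBITS_MAX; bits >= HASHBITS_MIN; bits--) {
		struct bucket_head *tbl = calloc((size_t)1 << bits, sizeof(*tbl));

		if (tbl != NULL) {
			*bits_out = bits;
			return tbl;
		}
	}
	return NULL;
}
```

With 8-byte bucket heads, 21 bits costs 16 MiB per table and 23 bits would cost 64 MiB, which is the memory side of the trade-off in the timing table.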
>
> Signed-off-by: Yonghong Song <yhs@xxxxxx>
> ---
>  dwarf_loader.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  dwarves.h      |  2 ++
>  pahole.c       |  8 +++++
>  3 files changed, 100 insertions(+)
>

[...]