Re: [PATCH dwarves 1/2] dwarf_loader: check .debug_abbrev for cross-cu references

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On April 1, 2021 3:52:05 PM GMT-03:00, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
>On Wed, Mar 31, 2021 at 7:58 PM Yonghong Song <yhs@xxxxxx> wrote:
>>
>> Commit 39227909db3c checked compilation flags to see
>> whether the binary is built with lto or not (-flto).
>> Currently, for clang lto build, default setting
>> won't put compilation flags in dwarf due to size
>> concern.
>>
>> David Blaikie suggested in [1] to scan through .debug_abbrev
>> for DW_FORM_ref_addr which should be most faster than
>> scanning through cu's. This patch implemented this
>> suggestion and replaced the previous compilation flag
>> matching approach. Indeed, it seems that the overhead
>> for this approach is indeed managable.
>>
>> I did some change to measure the overhead of cus_merging_cu():
>>   @@ -2650,7 +2652,15 @@ static int cus__load_module(struct cus *cus,
>struct conf_load *conf,
>>                   }
>>           }
>>
>>   -       if (cus__merging_cu(dw)) {
>>   +       bool do_merging;
>>   +       struct timeval start, end;
>>   +       gettimeofday(&start, NULL);
>>   +       do_merging = cus__merging_cu(dw);
>>   +       gettimeofday(&end, NULL);
>>   +       fprintf(stderr, "%ld %ld -> %ld %ld\n", start.tv_sec,
>start.tv_usec,
>>   +                       end.tv_sec, end.tv_usec);
>>   +
>>   +       if (do_merging) {
>>                   res = cus__merge_and_process_cu(cus, conf, mod, dw,
>elf, filename,
>>                                                   build_id,
>build_id_len,
>>                                                   type_cu ? &type_dcu
>: NULL);
>>
>> For lto vmlinux, the cus__merging_cu() checking takes
>> 130us over total "pahole -J vmlinux" time 65sec as the function bail
>out
>> earlier due to detecting a merging cu condition.
>> For non-lto vmlinux, the cus__merging_cu() checking takes
>> ~171368us over total pahole time 36sec, roughly 0.5% overhead.
>>
>>  [1] https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@xxxxxx/
>>
>
>It might be a nice little touch to add:
>
>Suggested-by: David Blaikie <blaikie@xxxxxxxxxx>

Sure, this is something that is becoming the norm, be it from patch submitters, or, that being somehow lost, by the maintainer.

I think this is not just fair, but documents what actually happened and encourage people to share ideas more freely and quickly.

I'll do it in this specific case.

if I failed to do so I'm the past, I'm sorry.

- Arnaldo
>
>> Signed-off-by: Yonghong Song <yhs@xxxxxx>
>> ---
>>  dwarf_loader.c | 43 ++++++++++++++++++++++++-------------------
>>  1 file changed, 24 insertions(+), 19 deletions(-)
>>
>> diff --git a/dwarf_loader.c b/dwarf_loader.c
>> index c1ca1a3..bd23751 100644
>> --- a/dwarf_loader.c
>> +++ b/dwarf_loader.c
>> @@ -2503,35 +2503,40 @@ static int cus__load_debug_types(struct cus
>*cus, struct conf_load *conf,
>>
>>  static bool cus__merging_cu(Dwarf *dw)
>>  {
>> -       uint8_t pointer_size, offset_size;
>>         Dwarf_Off off = 0, noff;
>>         size_t cuhl;
>> -       int cnt = 0;
>>
>> -       /*
>> -        * Just checking the first cu is not enough.
>> -        * In linux, some C files may have LTO is disabled, e.g.,
>> -        *   e242db40be27  x86, vdso: disable LTO only for vDSO
>> -        *   d2dcd3e37475  x86, cpu: disable LTO for cpu.c
>> -        * Fortunately, disabling LTO for a particular file in a LTO
>build
>> -        * is rather an exception. Iterating 5 cu's to check whether
>> -        * LTO is used or not should be enough.
>> -        */
>> -       while (dwarf_nextcu(dw, off, &noff, &cuhl, NULL,
>&pointer_size,
>> -                           &offset_size) == 0) {
>> +       while (dwarf_nextcu (dw, off, &noff, &cuhl, NULL, NULL, NULL)
>== 0) {
>>                 Dwarf_Die die_mem;
>>                 Dwarf_Die *cu_die = dwarf_offdie(dw, off + cuhl,
>&die_mem);
>>
>>                 if (cu_die == NULL)
>>                         break;
>>
>> -               if (++cnt > 5)
>> -                       break;
>> +               Dwarf_Off offset = 0;
>> +               while (true) {
>> +                       size_t length;
>> +                       Dwarf_Abbrev *abbrev = dwarf_getabbrev
>(cu_die, offset, &length);
>> +                       if (abbrev == NULL || abbrev ==
>DWARF_END_ABBREV)
>> +                               break;
>>
>> -               const char *producer = attr_string(cu_die,
>DW_AT_producer);
>> -               if (strstr(producer, "clang version") != NULL &&
>> -                   strstr(producer, "-flto") != NULL)
>> -                       return true;
>> +                       size_t attrcnt;
>> +                       if (dwarf_getattrcnt (abbrev, &attrcnt) != 0)
>> +                               return false;
>> +
>> +                       unsigned int attr_num, attr_form;
>> +                       Dwarf_Off aboffset;
>> +                       size_t j;
>> +                       for (j = 0; j < attrcnt; ++j) {
>> +                               if (dwarf_getabbrevattr (abbrev, j,
>&attr_num, &attr_form,
>> +                                                        &aboffset))
>> +                                       return false;
>> +                               if (attr_form == DW_FORM_ref_addr)
>> +                                       return true;
>> +                       }
>> +
>> +                       offset += length;
>> +               }
>>
>>                 off = noff;
>>         }
>> --
>> 2.30.2
>>

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux