On Tue, Mar 30, 2021 at 8:13 PM Yonghong Song <yhs@xxxxxx> wrote:
>
>
>
> On 3/30/21 7:51 PM, David Blaikie wrote:
> > On Tue, Mar 30, 2021 at 7:39 PM Fāng-ruì Sòng <maskray@xxxxxxxxxx> wrote:
> >>
> >> On Tue, Mar 30, 2021 at 6:48 PM Yonghong Song <yhs@xxxxxx> wrote:
> >>>
> >>>
> >>>
> >>> On 3/30/21 5:25 PM, Fangrui Song wrote:
> >>>> On 2021-03-30, 'Yonghong Song' via Clang Built Linux wrote:
> >>>>>
> >>>>>
> >>>>> On 3/29/21 3:52 PM, Nick Desaulniers wrote:
> >>>>>> (replying to
> >>>>>> https://lore.kernel.org/bpf/20210328064121.2062927-1-yhs@xxxxxx/)
> >>>>>>
> >>>>>> Thanks for the patch!
> >>>>>>
> >>>>>>> +# gcc emits compilation flags in dwarf DW_AT_producer by default
> >>>>>>> +# while clang needs explicit flag. Add this flag explicitly.
> >>>>>>> +ifdef CONFIG_CC_IS_CLANG
> >>>>>>> +DEBUG_CFLAGS += -grecord-gcc-switches
> >>>>>>> +endif
> >>>>>>> +
> >>>>
> >>>> Yes, gcc defaults to -grecord-gcc-switches. Clang doesn't.
> >>>
> >>> Could you know why? dwarf size concern?
> >>>
> >>>>
> >>>>>> This adds ~5MB/1% to vmlinux of an x86_64 defconfig built with clang.
> >>>>>> Do we
> >>>>>> want to add additional guards for CONFIG_DEBUG_INFO_BTF, so that we
> >>>>>> don't have
> >>>>>> to pay that cost if that config is not set?
> >>>>>
> >>>>> Since this patch is mostly motivated to detect whether the kernel is
> >>>>> built with clang lto or not. Let me add the flag only if lto is
> >>>>> enabled. My measurement shows 0.5% increase to thinlto-vmlinux.
> >>>>> The smaller percentage is due to larger .debug_info section
> >>>>> (almost double) for thinlto vs. no lto.
> >>>>>
> >>>>> ifdef CONFIG_LTO_CLANG
> >>>>> DEBUG_CFLAGS += -grecord-gcc-switches
> >>>>> endif
> >>>>>
> >>>>> This will make pahole with any clang built kernels, lto or non-lto.
> >>>>
> >>>> I share the same concern about sizes. Can't pahole know it is clang LTO
> >>>> via other means? If pahole just needs to know the one-bit information
> >>>> (clang LTO vs not), having every compile option seems unnecessary....
> >>>
> >>> This is v2 of the patch
> >>> https://lore.kernel.org/bpf/20210331001623.2778934-1-yhs@xxxxxx/
> >>> The flag will be guarded with CONFIG_LTO_CLANG.
> >>>
> >>> As mentioned in commit message of v2, the alternative is
> >>> to go through every cu to find out whether DW_FORM_ref_addr is used
> >>> or not. In other words, check every possible cross-cu references
> >>> to find whether cross-cu reference actually happens or not. This
> >>> is quite heavy for pahole...
> >>>
> >>> What we really want to know is whether cross-cu reference happens
> >>> or not? If there is an easy way to get it, that will be great.
> >>
> >> +David Blaikie
> >
> > Yep, that shouldn't be too hard to test for more directly - scanning
> > .debug_abbrev for DW_FORM_ref_addr should be what you need. Would that
> > be workable rather than relying on detecting clang/lto from command
> > line parameters? (GCC can produce these cross-CU references too, when
> > using lto - so this approach would help make the solution generalize
> > over GCC's behavior too)
>
> Thanks, David. This should be better. I tried with a non-lto vmlinux.
> Did "llvm-dwarfdump --debug-abbrev vmlinux > log" and then
> "grep "DW_CHILDREN_no" log | wc -l" and get 231676 records.

What conclusions are you drawing from this number/data? (I'm not
following how DW_CHILDREN_no relates to the topic - perhaps I'm
missing something)

> I will try this approach. If the time is a very small fraction of
> actual dwarf cu processing time, we should be fine. This definitely
> better than visit all die's in cu trying to detect cross-cu reference.

*fingers crossed*
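
For illustration only, here is a rough sketch of the .debug_abbrev scan
discussed above. It is not pahole's code. It assumes the raw bytes of the
.debug_abbrev section have already been extracted from the ELF file by some
other means (not shown), and the helper names read_uleb128 and
abbrev_uses_ref_addr are made up for this example. The form codes are the
values from the DWARF specification: DW_FORM_ref_addr is 0x10 and
DW_FORM_implicit_const (DWARF 5) is 0x21.

```python
DW_FORM_ref_addr = 0x10        # reference that may cross CU boundaries
DW_FORM_implicit_const = 0x21  # DWARF 5: value is stored in the abbrev itself


def read_uleb128(data, pos):
    """Decode one ULEB128 value at data[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def abbrev_uses_ref_addr(data):
    """Return True if any abbreviation declares a DW_FORM_ref_addr attribute.

    `data` is assumed to hold the whole .debug_abbrev section, i.e. one or
    more abbreviation tables placed back to back.
    """
    pos = 0
    while pos < len(data):
        code, pos = read_uleb128(data, pos)
        if code == 0:
            continue  # end of one table (or padding); keep scanning
        _tag, pos = read_uleb128(data, pos)
        pos += 1  # one byte: DW_CHILDREN_yes / DW_CHILDREN_no
        while True:
            attr, pos = read_uleb128(data, pos)
            form, pos = read_uleb128(data, pos)
            if attr == 0 and form == 0:
                break  # end of this abbreviation's attribute list
            if form == DW_FORM_implicit_const:
                # The constant is SLEB128; it uses the same continuation
                # bits as ULEB128, so the same reader skips it correctly.
                _value, pos = read_uleb128(data, pos)
            if form == DW_FORM_ref_addr:
                return True
    return False
```

A hit only says that some abbreviation declares the form, which is the cheap
signal suggested above; it does not walk any DIEs. One limitation of this
sketch: a reference encoded through DW_FORM_indirect picks its real form in
.debug_info, so a scan of .debug_abbrev alone would not see it.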