On Saturday, August 6, 2016 2:17:16 PM CEST Nicholas Piggin wrote:
> On Fri, 05 Aug 2016 21:16:00 +0200
> Arnd Bergmann <arnd@xxxxxxxx> wrote:
>
> > On Saturday, August 6, 2016 2:16:42 AM CEST Nicholas Piggin wrote:
> > > >
> > > > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > > > index 0ec807d69f18..7a3ad269fa23 100644
> > > > --- a/include/asm-generic/vmlinux.lds.h
> > > > +++ b/include/asm-generic/vmlinux.lds.h
> > > > @@ -433,7 +433,7 @@
> > > >   * during second ld run in second ld pass when generating System.map */
> > > >  #define TEXT_TEXT \
> > > >  ALIGN_FUNCTION(); \
> > > > - *(.text.hot .text .text.fixup .text.unlikely) \
> > > > + *(.text.hot .text .text.* .text.fixup .text.unlikely) \
> > > >  *(.ref.text) \
> > > >  MEM_KEEP(init.text) \
> > > >  MEM_KEEP(exit.text) \
> > > >
> > > >
> > > > It also got much faster again, the link time for an allyesconfig
> > > > kernel is now 18 minutes instead of 10 hours, but it's still
> > > > much worse than the 2 minutes I had earlier or the four minutes
> > > > with the previous patch.
> > >
> > > Are you using the patches I just sent?
> >
> > Not yet, I was still busy with the older version, and trying to
> > figure out exactly what went wrong in ld.bfd. FWIW, I first tried
> > to see if the hash tables were just too small, but as it turned
> > out that was not the problem. When I tried to change the default
> > hash table sizes, making them bigger only made things slower.
> >
> > I also found the --hash-size=xxx option, which has a significant
> > impact on runtime speed. Interestingly again, using sizes less
> > than the default made things faster in practice. If we can
> > work out the optimum size for the kernel build, that might
> > shave a few minutes off the total build time.
> >
> > > Either way, you also need
> > > to do the same for data and bss sections as you are using
> > > -fdata-sections too.
> >
> > Right.
> >
> > > I've found virtually no build time regression on powerpc or x86
> > > when those are taken care of properly (x86 numbers I sent are typo,
> > > it's not 5m20, it's 5m02).
> >
> > Interesting. I wonder if it's got something to do with the
> > generation of the branch trampolines on ARM, as we have a lot
> > of them on an allyesconfig.
>
> Powerpc generates quite a few branch trampolines as well, so
> I'm not sure if that would be the issue. Can you get a profile
> of the link?

CPU: AMD64 family15h, speed 2600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (CPU Clocks not Halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name  symbol name
1212556  63.6990  ld-new      bfd_hash_lookup
416050   21.8563  ld-new      bfd_hash_hash
64861     3.4073  no-vmlinux  /no-vmlinux
59038     3.1014  ld-new      bfd_hash_traverse
13873     0.7288  ld-new      bfd_get_next_section_by_name
9880      0.5190  ld-new      strrevcmp

I've manually marked bfd_hash_hash as __attribute__((noinline)) to see
it separately from bfd_hash_lookup. The vast majority of these calls
seem to come from _bfd_elf_strtab_add and from
bfd_get_section_by_name/bfd_get_next_section_by_name.

While I first thought the hash tables were too slow, investigating
further showed that most of the hash tables are really small (and
appropriately sized); we just do a lot of lookups on them.

> Are you linking with archives? Do your input archives have a
> symbol index built?

Yes to the first, and I don't know about the second. I've moved on to
your new patches now, will see how that goes.

> > Is the 5m20 the total build time for the kernel, the time for
> > rebuilding after a trivial change, or the time to call 'ld.bfd'
> > once?
>
> 5m02 was the total time for x86 defconfig.
> With the powerpc allyesconfig build, the final link:
>
> $ time ld -EL -m elf64lppc -pie --emit-relocs --build-id --gc-sections -X -o vmlinux -T ./arch/powerpc/kernel/vmlinux.lds --whole-archive built-in.o .tmp_kallsyms2.o
>
> real 0m15.556s
> user 0m13.288s
> sys 0m2.240s
>
> $ ls -lh vmlinux
> -rwxrwxr-x 1 npiggin npiggin 279M Aug 6 14:02 vmlinux
>
> Without -pie --emit-relocs it's 11.8s and 150M but I'm using
> emit-relocs for a post-link step.

Interesting, that does sound more like an ARM-specific bug in ld then.

> > Are you using ld.bfd on x86 or ld.gold? For me ld.gold either
> > works and is really fast, or it crashes, depending on the
> > configuration. I also don't think it supports big-endian ARM
> > (which is what allyesconfig ends up using).
>
> ld.bfd on both. Gold crashed on powerpc and I didn't try it on x86.

Ok.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-next" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html