On Thu, Aug 19, 2021 at 3:59 PM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
> On Thu, Aug 19, 2021 at 09:57:41AM +0900, Masahiro Yamada wrote:
> > When Clang LTO is enabled, additional intermediate files *.lto.o are
> > created because LLVM bitcode must be converted to ELF before modpost.
> >
> > For non-LTO builds:
> >
> >          $(LD)             $(LD)
> >   objects ---> <modname>.o -----> <modname>.ko
> >                              |
> >           <modname>.mod.o ---/
> >
> > For Clang LTO builds:
> >
> >          $(AR)            $(LD)                 $(LD)
> >   objects ---> <modname>.o ---> <modname>.lto.o -----> <modname>.ko
> >                                                   |
> >                                 <modname>.mod.o --/
> >
> > Since the Clang LTO introduction, ugly CONFIG_LTO_CLANG conditionals
> > are sprinkled everywhere in the kbuild code.
> >
> > Another confusion for Clang LTO builds is, <modname>.o is an archive
> > that contains LLVM bitcode files. The suffix should have been .a
> > instead of .o
> >
> > To clean up the code, unify the build process of modules, as follows:
> >
> >          $(AR)            $(LD)                     $(LD)
> >   objects ---> <modname>.a ---> <modname>.prelink.o -----> <modname>.ko
> >                                                       |
> >                                 <modname>.mod.o ------/
> >
> > Here, 'objects' are either ELF or LLVM bitcode. <modname>.a is an archive,
> > <modname>.prelink.o is ELF.
>
> I like this design, but I do see that it has a small but measurable
> impact on build times:
>
> allmodconfig build, GCC:
>
> make -j72 allmodconfig
> make -j72 -s clean && time make -j72
>
> kbuild/for-next:
>     6m16.140s
>     6m19.742s
>     6m15.848s
>
> +this-series:
>     6m22.742s
>     6m20.589s
>     6m19.911s
>
> Though with not so many modules, it's within the noise:
>
> defconfig build, GCC:
>
> make -j72 defconfig
> make -j72 -s clean && time make -j72
>
> kbuild/for-next:
>     0m41.579s
>     0m41.214s
>     0m41.370s
>
> +series:
>     0m41.423s
>     0m41.434s
>     0m41.384s
>
>
> However, I do see that even LTO builds are slightly slower now, so
> perhaps the above numbers aren't due to the added $(AR) step:
>
> allmodconfig + Clang ThinLTO:
>
> make -j72 LLVM=1 LLVM_IAS=1 allmodconfig
> ./scripts/config -d GCOV_KERNEL -d KASAN -d LTO_NONE -e LTO_CLANG_THIN
> make -j72 LLVM=1 LLVM_IAS=1 olddefconfig
> make -j72 -s LLVM=1 LLVM_IAS=1 clean && time make -j72 LLVM=1 LLVM_IAS=1
>
> kbuild/for-next:
>     9m53.927s
>     9m45.874s
>     9m47.722s
>
> +series:
>     9m58.395s
>     9m53.201s
>     9m56.387s

I have not tested this closely, but this might be
the cost of '$(AR) t $<'.

In Sami's implementation, *.symversions are merged by a shell command.
Presumably, that runs faster than llvm-ar.
On the other hand, it carries a risk of 'Argument list too long',
as reported in [1].

[1] https://lore.kernel.org/lkml/20210614094948.30023-1-lecopzer.chen@xxxxxxxxxxxx/

Anyway, when I find time, I will do some benchmarking.

>
>
> I haven't been able to isolate where the changes in build times are
> coming from (nor have I done link-phase-only timings -- I realize those
> are really the most important).
>
> I did notice some warnings from this patch, though, in the
> $(modules-single) target:
>
> scripts/Makefile.build:434: target 'drivers/scsi/libiscsi.a' given more than once in the same rule
> scripts/Makefile.build:434: target 'drivers/atm/suni.a' given more than once in the same rule

Ah, right.

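For reference, the warning can be reproduced outside kbuild. The script
below is only a sketch under assumptions: demo.a and other.a are made-up
names, the generated Makefile uses a plain explicit rule rather than the
rule in scripts/Makefile.build, and GNU Make is assumed to be in PATH.
It counts the "given more than once" warnings for a target list that
contains the same word twice, first as-is and then wrapped in $(sort ...),
which removes duplicate words in addition to sorting.

```shell
#!/bin/sh
# Sketch only: demo.a/other.a are hypothetical names, not kbuild targets.

# Print how many "given more than once" warnings GNU Make emits.
# $1 is the Make expression used as the rule's target list.
count_dup_warnings() {
	dir=$(mktemp -d) || return 1
	{
		# The same word appears twice in the list on purpose.
		printf 'mods := demo.a other.a demo.a\n'
		# One rule with a no-op recipe for every listed target.
		printf '%s:\n\t@:\n' "$1"
	} > "$dir/Makefile"
	# LC_ALL=C keeps the warning text untranslated for grep.
	LC_ALL=C make -f "$dir/Makefile" 2>&1 | grep -c 'given more than once'
	rm -rf "$dir"
}

count_dup_warnings '$(mods)'          # raw list, duplicate reaches the rule
count_dup_warnings '$(sort $(mods))'  # duplicates removed before the rule
```

With GNU Make, the first call should report one warning and the second
none; other make implementations may word the diagnostic differently.
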
I also noticed needless rebuilds of *.prelink.symversions.

In v2, I will fix them as follows:

index 957addea830b..cf6b79dff5f9 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -438,6 +438,8 @@ cmd_merge_symver = \
 $(obj)/%.prelink.symversions: $(obj)/%.a FORCE
 	$(call if_changed,merge_symver)
 
+targets += $(patsubst %.a, %.prelink.symversions, $(modules))
+
 $(obj)/%.prelink.o: ld_flags += --script=$(filter %.symversions,$^)
 module-symver = $(obj)/%.prelink.symversions
 
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index f604d2d01cad..5074922db82d 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -107,8 +107,8 @@ real-dtb-y := $(addprefix $(obj)/, $(real-dtb-y))
 subdir-ym := $(addprefix $(obj)/,$(subdir-ym))
 
 modules := $(patsubst %.o, %.a, $(obj-m))
-modules-multi := $(patsubst %.o, %.a, $(multi-obj-m))
-modules-single := $(filter-out $(modules-multi), $(filter %.a, $(modules)))
+modules-multi := $(sort $(patsubst %.o, %.a, $(multi-obj-m)))
+modules-single := $(sort $(filter-out $(modules-multi), $(filter %.a, $(modules))))
 
 # Finds the multi-part object the current object will be linked into.
 # If the object belongs to two or more multi-part objects, list them all.

-- 
Best Regards
Masahiro Yamada