On Wed, Sep 9, 2020 at 8:46 AM Sami Tolvanen <samitolvanen@xxxxxxxxxx> wrote:
>
> On Sun, Sep 06, 2020 at 09:24:38AM +0900, Masahiro Yamada wrote:
> > On Fri, Sep 4, 2020 at 5:30 AM Sami Tolvanen <samitolvanen@xxxxxxxxxx> wrote:
> > >
> > > This patch series adds support for building x86_64 and arm64 kernels
> > > with Clang's Link Time Optimization (LTO).
> > >
> > > In addition to performance, the primary motivation for LTO is
> > > to allow Clang's Control-Flow Integrity (CFI) to be used in the
> > > kernel. Google has shipped millions of Pixel devices running three
> > > major kernel versions with LTO+CFI since 2018.
> > >
> > > Most of the patches are build system changes for handling LLVM
> > > bitcode, which Clang produces with LTO instead of ELF object files,
> > > postponing ELF processing until a later stage, and ensuring initcall
> > > ordering.
> > >
> > > Note that patches 1-4 are not directly related to LTO, but are
> > > needed to compile LTO kernels with ToT Clang, so I'm including them
> > > in the series for your convenience:
> > >
> > >  - Patches 1-3 are required for building the kernel with ToT Clang,
> > >    and IAS, and patch 4 is needed to build allmodconfig with LTO.
> > >
> > >  - Patches 3-4 are already in linux-next, but not yet in 5.9-rc.
> > >
> >
> >
> > I still do not understand how this patch set works.
> > (only me?)
> >
> > Please let me ask fundamental questions.
> >
> >
> >
> > I applied this series on top of Linus' tree,
> > and compiled for ARCH=arm64.
> >
> > I compared the kernel size with/without LTO.
> >
> >
> >
> > [1] No LTO (arm64 defconfig, CONFIG_LTO_NONE)
> >
> > $ llvm-size vmlinux
> >     text     data     bss      dec     hex filename
> > 15848692 10099449  493060 26441201 19375f1 vmlinux
> >
> >
> >
> > [2] Clang LTO (arm64 defconfig + CONFIG_LTO_CLANG)
> >
> > $ llvm-size vmlinux
> >     text     data     bss      dec     hex filename
> > 15906864 10197445  490804 26595113 195cf29 vmlinux
> >
> >
> > I compared the size of raw binary, arch/arm64/boot/Image.
> > Its size increased too.
> >
> >
> >
> > So, in my experiment, enabling CONFIG_LTO_CLANG
> > increases the kernel size.
> > Is this correct?
>
> Yes. LTO does produce larger binaries, mostly due to function
> inlining between translation units, I believe. The compiler people
> can probably give you a more detailed answer here. Without -mllvm
> -import-instr-limit, the binaries would be even larger.
>
> > One more thing, could you teach me
> > how Clang LTO optimizes the code against
> > relocatable objects?
> >
> >
> >
> > When I learned Clang LTO first, I read this document:
> > https://llvm.org/docs/LinkTimeOptimization.html
> >
> > It is easy to confirm the final executable
> > does not contain foo2, foo3...
> >
> >
> >
> > In contrast to userspace programs,
> > kernel modules are basically relocatable objects.
> >
> > Does Clang drop unused symbols from relocatable objects?
> > If so, how?
>
> I don't think the compiler can legally drop global symbols from
> relocatable objects, but it can rename and possibly even drop static
> functions.


Compilers can drop static functions without LTO.
Rather, it is a compiler warning
(-Wunused-function), so the code should be cleaned up.


> This is why we need global wrappers for initcalls, for
> example, to have stable symbol names.
>
> Sami


At first, I thought the motivation of LTO
was to remove unused global symbols, and
to perform further optimization.

It is true for userspace programs.
In fact, the example of
https://llvm.org/docs/LinkTimeOptimization.html
produces a smaller binary.

In contrast, this patch set produces a bigger kernel
because LTO cannot remove any unused symbol.

So, I do not understand what the benefit is.

Is inlining beneficial?
I am not sure.

Documentation/process/coding-style.rst
"15) The inline disease"
mentions that inlining is not always
a good thing.

As a whole, I still do not understand
the motivation of this patch set.

--
Best Regards
Masahiro Yamada
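
The size comparison quoted in this thread can be checked with a few lines of Python. This is only a sketch: the figures are copied from the two llvm-size outputs above, while the dictionary and function names are illustrative and not part of the patch set or of any tool.

```python
# Section sizes in bytes, copied from the llvm-size output quoted above.
# The names NO_LTO, CLANG_LTO and size_delta are illustrative assumptions.
NO_LTO = {"text": 15848692, "data": 10099449, "bss": 493060}
CLANG_LTO = {"text": 15906864, "data": 10197445, "bss": 490804}


def size_delta(before, after):
    """Return per-section and total byte differences (after minus before)."""
    delta = {name: after[name] - before[name] for name in before}
    delta["total"] = sum(after.values()) - sum(before.values())
    return delta


delta = size_delta(NO_LTO, CLANG_LTO)

if __name__ == "__main__":
    total_before = sum(NO_LTO.values())
    for name, diff in delta.items():
        print(f"{name:>5}: {diff:+d} bytes")
    print(f"relative growth: {100 * delta['total'] / total_before:.2f}%")
```

With these inputs the text and data sections grow while bss shrinks slightly, so the increase comes from the code and initialized data of the LTO build.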