On Thu, Sep 10, 2020 at 10:18:05AM +0900, Masahiro Yamada wrote:
> On Wed, Sep 9, 2020 at 8:46 AM Sami Tolvanen <samitolvanen@xxxxxxxxxx> wrote:
> >
> > On Sun, Sep 06, 2020 at 09:24:38AM +0900, Masahiro Yamada wrote:
> > > On Fri, Sep 4, 2020 at 5:30 AM Sami Tolvanen <samitolvanen@xxxxxxxxxx> wrote:
> > > >
> > > > This patch series adds support for building x86_64 and arm64 kernels
> > > > with Clang's Link Time Optimization (LTO).
> > > [...]
> > > One more thing, could you teach me
> > > how Clang LTO optimizes the code against
> > > relocatable objects?
> > >
> > > When I learned Clang LTO first, I read this document:
> > > https://llvm.org/docs/LinkTimeOptimization.html
> > >
> > > It is easy to confirm the final executable
> > > does not contain foo2, foo3...
> > >
> > > In contrast to userspace programs,
> > > kernel modules are basically relocatable objects.
> > >
> > > Does Clang drop unused symbols from relocatable objects?
> > > If so, how?
> >
> > I don't think the compiler can legally drop global symbols from
> > relocatable objects, but it can rename and possibly even drop static
> > functions.
>
> Compilers can drop static functions without LTO.
> Rather, it is a compiler warning
> (-Wunused-function), so the code should be cleaned up.

Right -- I think you're both saying the same thing. Unused static
functions can be dropped (modulo a warning) in both regular and LTO
builds.

> At first, I thought the motivation of LTO
> was to remove unused global symbols, and
> to perform further optimization.

One of LTO's benefits is the performance optimizations, but that's not
the driving motivation for it here. The performance optimizations are
possible because LTO provides the compiler with a view of the entire
built-in portion of the kernel (i.e. not shared objects). That "visible
all at once" state is the central concern because CFI (Control Flow
Integrity, the driving motivation for this series) needs it in the same
way that the performance optimization passes need it. i.e. to gain CFI
coverage, LTO is required. Since LTO is a distinct first step
independent of CFI, it was split out to be upstreamed while fixes for
CFI continued to land independently[1]. Once LTO is landed, CFI comes
next.

> In contrast, this patch set produces a bigger kernel
> because LTO cannot remove any unused symbol.
>
> So, I do not understand what the benefit is.
>
> Is inlining beneficial?
> I am not sure.

This is just a side-effect of LTO. As Sami mentions, it's entirely
tunable, and that tuning was chosen based on measurements made for the
kernel being built with LTO[2].

> As a whole, I still do not understand
> the motivation of this patch set.

It is a prerequisite for CFI, and CFI has been protecting
*mumble*billion Android device kernels against code-reuse attacks for
the last 2ish years[3]. I want this available for the entire Linux
ecosystem, not just Android; it is a strong security flaw mitigation
technique.

I hope that helps explain it!

-Kees

[1] for example, these are some:
    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=Control+Flow+Integrity
[2] https://lore.kernel.org/lkml/20200624203200.78870-1-samitolvanen@xxxxxxxxxx/T/#m6b576c3af79bdacada10f21651a2b02d33a4e32e
[3] https://android-developers.googleblog.com/2018/10/control-flow-integrity-in-android-kernel.html

--
Kees Cook