Re: [RFC PATCH 0/4] x86/build: Get rid of vmlinux postlink step

Ard Biesheuvel <ardb@xxxxxxxxxx> · Mon, 24 Feb 2025 22:25:21 +0100

On Mon, 24 Feb 2025 at 21:00, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, 24 Feb 2025 at 10:51, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> > But in terms of justification for upstreaming, the reduction in
> > complexity alone makes it worth it IMO:
> >
> >   19 files changed, 52 insertions(+), 87 deletions(-)
>
> Yeah, absolutely. Our fancy make build rules still have too many of
> the phony forced targets, but this is a few less of them and makes the
> build conform (more) to the usual rules.
>
> I do wonder if we could just get rid of that
> CONFIG_ARCH_VMLINUX_NEEDS_RELOCS entirely and make it just be how all
> architectures do it.
>
> Yes, it was apparently "just" riscv/s390/x86/mips that did that
> 'strip_relocs' hack, but at the same time that whole pass *feels*
> entirely generic.
>

TL;DR: it is not.

It is only needed on architectures that use --emit-relocs in the first
place, e.g., to construct bespoke KASLR tables. This is actually a
somewhat dubious practice, because these are static relocations, i.e.,
what the linker consumes as input, and they are emitted along with
vmlinux as output. [*] This feature was (AFAIK) never really intended
for constructing dynamic relocation tables as some architectures in
Linux do.

On those architectures, these static relocations need to be stripped
again, to avoid bloating vmlinux with useless data.

On architectures that rely on PIE linking (such as arm64), the linker
will emit a dynamic relocation table that is more suitable for use at
boot time, i.e., it only contains absolute relocations (as
PC-relative ones never require any fixing up at boot), and uses RELR
format to pack them very densely, removing the need for our own
special format.
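
To illustrate why RELR packs so densely, here is a rough sketch of a
decoder for the 64-bit encoding. It is not kernel code, and the function
name and sample values are made up. An even entry is an address to
relocate; an odd entry is a bitmap whose bits 1..63 select which of the
next 63 words also need relocating.

```python
def decode_relr(entries, word_size=8):
    """Expand RELR entries into the list of addresses to relocate.

    Illustrative sketch only; 'entries' is a list of unsigned integers
    as they would appear in a .relr.dyn section.
    """
    payload_bits = word_size * 8 - 1  # 63 usable bitmap bits on 64-bit
    addrs = []
    where = 0  # address of the first word covered by the next bitmap
    for entry in entries:
        if entry & 1 == 0:
            # Even entry: an explicit address. Bitmaps that follow
            # describe the words right after it.
            addrs.append(entry)
            where = entry + word_size
        else:
            # Odd entry: bitmap. After dropping the tag bit, bit i set
            # means the word at where + i * word_size is relocated.
            bitmap = entry >> 1
            i = 0
            while bitmap:
                if bitmap & 1:
                    addrs.append(where + i * word_size)
                bitmap >>= 1
                i += 1
            where += payload_bits * word_size
    return addrs


# Hypothetical table: one address entry plus one bitmap entry.
print([hex(a) for a in decode_relr([0x1000, 0b1011])])
```

A run of 64 consecutive relocated words thus costs two 8-byte entries,
where RELA would spend 24 bytes per relocation.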

Architectures that do not implement KASLR in the first place have no
need for these static relocations either.

PIE linking is generally a better choice than relying on
--emit-relocs, but it is highly ISA dependent whether that requires a
costlier kind of codegen. On arm64, we don't even bother generating
-fPIE code because ordinary code can be linked in PIE mode. OTOH, on
x86, we'd need full 64-bit PIC/PIE codegen in order to link with PIE,
whereas we currently rely on the 'kernel' code model to generate
32-bit wide absolute symbol references that can only refer to the top
2G of the 64-bit address space.
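
The 2G restriction follows from sign extension of the 32-bit field. A
small sketch of that check (the function name and the sample addresses
are only for illustration, they are not taken from the kernel):

```python
def fits_sign_extended_32(addr):
    """True if a 64-bit address can be stored in 32 bits and recovered
    by sign-extending it back to 64 bits."""
    low = addr & 0xFFFFFFFF
    if low & 0x80000000:
        # Sign bit set: extension fills the upper 32 bits with ones.
        low |= 0xFFFFFFFF00000000
    return low == addr


# Illustrative addresses only.
for addr in (0xFFFFFFFF80000000,   # bottom of the top 2G: representable
             0xFFFFFFFFFFFFFFFF,   # very top: representable
             0xFFFFFFFF7FFFFFFF,   # just below the top 2G: not
             0xFFFF888000000000):  # elsewhere in the upper half: not
    print(hex(addr), fits_sign_extended_32(addr))
```

Strictly speaking the lowest 2G of the address space passes this check
too, but the kernel image does not live there.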

[*] Using static relocations to describe a fully linked binary such as
vmlinux is problematic because a) they cover all external symbol
references, including relative ones that don't need fixing up, but
more importantly, b) the linker may perform relaxations that result in
the code going out of sync with the relocation that annotates it (this
is not entirely avoidable if the relaxed version of the code cannot
even be described by any relocation specified by the ELF psABI).
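
To make point a) concrete, here is a simplified sketch of the filtering
that any tool consuming --emit-relocs output has to do on x86-64. The
relocation type numbers are the psABI ones; the helper names and the
sample input are hypothetical, and a real tool has to handle more cases
than this.

```python
# x86-64 psABI relocation type numbers.
R_X86_64_64 = 1
R_X86_64_PC32 = 2
R_X86_64_PLT32 = 4
R_X86_64_32 = 10
R_X86_64_32S = 11

# Absolute references encode a link-time address, so they change when
# the image is moved. Relative ones encode a distance within the image,
# which stays the same.
ABSOLUTE_TYPES = {R_X86_64_64, R_X86_64_32, R_X86_64_32S}


def needs_boot_fixup(rel_type):
    """True if this static relocation type must be applied again when
    the image is loaded at a different address."""
    return rel_type in ABSOLUTE_TYPES


def kaslr_offsets(relocs):
    """Keep only the offsets that need fixing up at boot.

    'relocs' is an iterable of (offset, type) pairs, a stand-in for
    whatever the linker emitted.
    """
    return sorted(off for off, rel_type in relocs
                  if needs_boot_fixup(rel_type))


# Hypothetical input: two absolute and two relative references.
sample = [(0x40, R_X86_64_PC32), (0x10, R_X86_64_64),
          (0x58, R_X86_64_PLT32), (0x24, R_X86_64_32S)]
print([hex(off) for off in kaslr_offsets(sample)])
```

Everything this drops is data the linker emitted into vmlinux for
nothing, and none of it helps with point b).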