Hi Peter,

Thanks for taking a look.

On Tue, 1 Oct 2024 at 23:13, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>
> On 9/25/24 08:01, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@xxxxxxxxxx>
> >
> > As an intermediate step towards enabling PIE linking for the 64-bit x86
> > kernel, enable PIE codegen for all objects that are linked into the
> > kernel proper.
> >
> > This substantially reduces the number of relocations that need to be
> > processed when booting a relocatable KASLR kernel.
> >
>
> This really seems like going completely backwards to me.
>
> You are imposing a more restrictive code model on the kernel, optimizing
> for boot time in a way that will exert a permanent cost on the running
> kernel.
>

Fair point about the boot time. This is not the only concern, though,
and arguably the least important one.

As I responded to Andi before, it is also about using a code model and
relocation model that matches the reality of how the code is executed:

- the early C code runs from the 1:1 mapping, and needs special hacks
  to accommodate this
- KASLR runs the kernel from a different virtual address than the one
  we told the linker about

> There is a *huge* difference between the kernel and user space here:
>
> KERNEL MEMORY IS PERMANENTLY ALLOCATED, AND IS NEVER SHARED.
>

No need to shout.

> Dirtying user pages requires them to be unshared and dirty, which is
> undesirable. Kernel pages are *always* unshared and dirty.
>

I guess you are referring to the use of a GOT? That is a valid
concern, but it does not apply here. With hidden visibility and
compiler command line options such as -mdirect-extern-access, all
emitted symbol references are direct.
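For illustration, this can be observed with a quick experiment (a
sketch, assuming gcc on an x86-64 host; the file and symbol names are
made up):

```shell
# With hidden visibility, PIE codegen references even extern data
# directly, with no GOT indirection (no @GOTPCREL load).
cat > ext.c <<'EOF'
extern int foo __attribute__((visibility("hidden")));
int get(void) { return foo; }
EOF
gcc -O2 -fpie -S -o ext.s ext.c
grep foo ext.s
# expect a direct RIP-relative reference along the lines of:
#   movl  foo(%rip), %eax
# and no foo@GOTPCREL anywhere in the output
```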
Disallowing text relocations would be trivial to enforce with this
series if desired, and doing so actually helps avoid the tricky bugs
we keep fixing in the early startup code that executes from the 1:1
mapping (the C code in .head.text).

So it mostly comes down to minor differences in addressing modes,
e.g.,

  movq $sym, %reg

actually uses more bytes than

  leaq sym(%rip), %reg

whereas

  movq sym, %reg

and

  movq sym(%rip), %reg

are the same length. OTOH, indexing a statically allocated global
array like

  movl array(,%reg1,4), %reg2

will be converted into

  leaq array(%rip), %reg2
  movl (%reg2,%reg1,4), %reg2

and is therefore less efficient in terms of code footprint.

But in general, the x86_64 ISA and psABI are quite flexible in this
regard, and extrapolating from past experience with PIC code on i386
is not really justified here.

As Andi also pointed out, what ultimately matters is performance, as
well as code size where it affects performance through the I-cache
footprint. I'll do some testing before reposting, and maybe not
bother if the impact is negative.

> > It also brings us much closer to the ordinary PIE relocation model used
> > for most of user space, which is therefore much better supported and
> > less likely to create problems as we increase the range of compilers and
> > linkers that need to be supported.
> >
>
> We have been resisting *for ages* making the kernel worse to accomodate
> broken compilers. We don't "need" to support more compilers -- we need
> the compilers to support us. We have working compilers; any new compiler
> that wants to play should be expected to work correctly.
>

We are in a much better place now than we were before in that regard,
which is actually how this effort came about: instead of lying to the
compiler, and maintaining our own pile of scripts and relocation
tools, we can just do what other arches are doing in Linux, and let
the toolchain do it for us.
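P.S. The global-array addressing difference discussed above is easy
to reproduce (a sketch, assuming gcc on an x86-64 host; file and
symbol names are made up):

```shell
# Compare codegen for indexing a global array with and without PIE:
# non-PIE can use an absolute base in the indexed addressing mode,
# while PIE needs an extra leaq to form the base address first.
cat > arr.c <<'EOF'
int array[16];
int get(int i) { return array[i]; }
EOF
gcc -O2 -fno-pie -S -o arr-nopie.s arr.c
gcc -O2 -fpie    -S -o arr-pie.s   arr.c
grep 'array' arr-nopie.s   # e.g.  movl array(,%rdi,4), %eax
grep 'array' arr-pie.s     # e.g.  leaq array(%rip), %rax  + indexed movl
```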