From: Ard Biesheuvel <ardb@xxxxxxxxxx>

This is a follow-up to [0] which implemented rigorous build time checks
to ensure that any code that is executed during early startup supports
running from the initial 1:1 mapping of memory, which is how the kernel
is entered from the decompressor or the EFI firmware.

Using PIC codegen and introducing new magic sections into generic code
would create a maintenance burden, and more experimentation is needed
there. One issue with PIC codegen is that it still permits the compiler
to make assumptions about the runtime address of global objects (modulo
runtime relocation), which is incompatible with how the kernel is
entered, i.e., running a fully linked and relocated executable from the
wrong runtime address.

The RIP_REL_REF() macro that was introduced recently [1] is actually
more appropriate for this use case, as it hides the access from the
compiler entirely, and so the compiler can never predict its result.

To make incremental progress on this, this v5 drops the special
instrumentation for .pi.text and PIC codegen, but retains all the
cleanup work on the startup code to make it more maintainable and more
obviously correct.

In particular, this involves:

- getting rid of early accesses to global objects, either by moving them
  to the stack, deferring the access until later, or dropping the
  globals entirely;

- moving all code that runs early via the 1:1 mapping into .head.text,
  and moving code that does not out of it, so that build time checks can
  be added later to ensure that no inadvertent absolute references were
  emitted into code that does not tolerate them;

- removing fixup_pointer() and occurrences of __pa_symbol(), which rely
  on the compiler emitting absolute references, and this is not
  guaranteed.
  (Without -fpic, the compiler might still use RIP-relative references
  in some cases)

Changes since v4 [2]:
- incorporate Boris's tweaked version of patch #1
- split __startup64() changes into multiple patches, and align more
  closely with the original logic
- fix build for CONFIG_X86_5LEVEL=n
- add comment to clarify that CR4.PSE is always set deliberately
- add separate SME startup change to remove SME/SEV related calls from
  the non-SME/SEV boot path (this can be backported more easily further
  back than to where we need the changes for SEV guest boot)

Changes since v3:
- dropped half of the patches and added a couple of new ones
- applied feedback from Boris to patches that were retained, mostly
  related to some minor oversights on my part, and to some style issues

[0] https://lkml.kernel.org/r/20240129180502.4069817-21-ardb%2Bgit%40google.com
[1] https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/sev&id=1c811d403afd73f0
[2] https://lkml.kernel.org/r/20240213124143.1484862-13-ardb%2Bgit%40google.com

Cc: Kevin Loughlin <kevinloughlin@xxxxxxxxxx>
Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
Cc: Dionna Glaze <dionnaglaze@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Arnd Bergmann <arnd@xxxxxxxx>
Cc: Nathan Chancellor <nathan@xxxxxxxxxx>
Cc: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
Cc: Justin Stitt <justinstitt@xxxxxxxxxx>
Cc: Kees Cook <keescook@xxxxxxxxxxxx>
Cc: Brian Gerst <brgerst@xxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: linux-arch@xxxxxxxxxxxxxxx
Cc: llvm@xxxxxxxxxxxxxxx

Ard Biesheuvel (16):
  x86/startup_64: Simplify global variable accesses in GDT/IDT
    programming
  x86/startup_64: Use RIP_REL_REF() to assign phys_base
  x86/startup_64: Use RIP_REL_REF() to access early_dynamic_pgts[]
  x86/startup_64: Use RIP_REL_REF() to access __supported_pte_mask
  x86/startup_64: Use RIP_REL_REF() to access early page tables
  x86/startup_64: Use RIP_REL_REF() to access early_top_pgt[]
  x86/startup_64: Simplify CR4 handling in startup code
  x86/startup_64: Defer assignment of 5-level paging global variables
  x86/startup_64: Simplify calculation of initial page table address
  x86/startup_64: Simplify virtual switch on primary boot
  x86/sme: Avoid SME/SVE related checks on non-SME/SVE platforms
  efi/libstub: Add generic support for parsing mem_encrypt=
  x86/boot: Move mem_encrypt= parsing to the decompressor
  x86/sme: Move early SME kernel encryption handling into .head.text
  x86/sev: Move early startup code into .head.text section
  x86/startup_64: Drop global variables keeping track of LA57 state

 arch/x86/boot/compressed/misc.c                |  15 ++
 arch/x86/boot/compressed/misc.h                |   4 -
 arch/x86/boot/compressed/pgtable_64.c          |  12 --
 arch/x86/boot/compressed/sev.c                 |   3 +
 arch/x86/boot/compressed/vmlinux.lds.S         |   1 +
 arch/x86/include/asm/mem_encrypt.h             |   8 +-
 arch/x86/include/asm/pgtable_64_types.h        |  43 ++---
 arch/x86/include/asm/setup.h                   |   2 +-
 arch/x86/include/asm/sev.h                     |  10 +-
 arch/x86/include/uapi/asm/bootparam.h          |   1 +
 arch/x86/kernel/cpu/common.c                   |   2 -
 arch/x86/kernel/head64.c                       | 195 ++++++--------------
 arch/x86/kernel/head_64.S                      |  95 ++++------
 arch/x86/kernel/sev-shared.c                   |  23 +--
 arch/x86/kernel/sev.c                          |  14 +-
 arch/x86/lib/Makefile                          |  13 --
 arch/x86/mm/kasan_init_64.c                    |   3 -
 arch/x86/mm/mem_encrypt_identity.c             |  89 +++------
 drivers/firmware/efi/libstub/efi-stub-helper.c |   8 +
 drivers/firmware/efi/libstub/efistub.h         |   2 +-
 drivers/firmware/efi/libstub/x86-stub.c        |   3 +
 21 files changed, 203 insertions(+), 343 deletions(-)

base-commit: ee8ff8768735edc3e013837c4416f819543ddc17
-- 
2.44.0.rc0.258.g7320e95886-goog