On Tue, Jun 14, 2022, Tom Lendacky wrote:
> On 6/14/22 11:13, Sean Christopherson wrote:
> > > > > This breaks SME on Rome and Milan when compiling with clang-13.  I haven't been
> > > > > able to figure out exactly what goes wrong.  printk isn't functional at this point,
> > > > > and interactive debug during boot on our test systems is beyond me.  I can't even
> > > > > verify that the bug is specific to clang because the draconian build system for our
> > > > > test systems apparently is stuck pointing at gcc-4.9.
> > > > >
> > > > > I suspect the issue is related to relocation and/or encrypting memory, as skipping
> > > > > the call to early_snp_set_memory_shared() if SNP isn't active masks the issue.
> > > > > I've dug through the assembly and haven't spotted a smoking gun, e.g. no obvious
> > > > > use of absolute addresses.
> > > > >
> > > > > Forcing a VM through the same path doesn't fail.  I can't test an SEV guest at the
> > > > > moment because INIT_EX is also broken.
> > > >
> > > > The SEV INIT_EX was a PEBKAC issue.  An SEV guest boots just fine with a clang-built
> > > > kernel, so either it's a finicky relocation issue or something specific to SME.
> > >
> > > I just built and booted 5.19-rc2 with clang-13 and SME enabled without issue:
> > >
> > > [    4.118226] Memory Encryption Features active: AMD SME
> >
> > Phooey.
> >
> > > Maybe something with your kernel config? Can you send me your config?
> >
> > Attached.  If you can't repro, I'll find someone on our end to work on this.
>
> I was able to repro. It dies in the cc_platform_has() code, where it is
> trying to do an indirect jump based on the attribute (actually in the
> amd_cc_platform_has() which I think has been optimized in):
>
> bool cc_platform_has(enum cc_attr attr)

...

> ffffffff81002160:       ff 24 c5 c0 01 00 82    jmp    *-0x7dfffe40(,%rax,8)
>
> This last line is what causes the reset. I'm guessing that the jump isn't
> valid at this point because we are running in identity mapped mode and not
> with a kernel virtual address at this point.
>
> Trying to see what the difference was between your config and mine, the
> indirect jump led me to check the setting of CONFIG_RETPOLINE. Your config
> did not have it enabled, so I set CONFIG_RETPOLINE=y, and with that, the
> kernel boots successfully.

That would explain why my VMs didn't fail: I build those kernels with
CONFIG_RETPOLINE=y.

> With retpolines, the code is completely different around here:

...

> I'm not sure if there's a way to remove the jump table optimization for
> the arch/x86/coco/core.c file when retpolines aren't configured.

And for post-boot I don't think we'd want to disable any such optimizations.

A possible "fix" would be to do what sme_encrypt_kernel() does and just query
sev_status directly.  But even if that works, the fragility of the boot code is
terrifying :-(  I can't think of any clever solutions though.

Many thanks again Tom!

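To make the "query sev_status directly" idea concrete, here is a rough
user-space sketch, not kernel code.  Every demo_* / DEMO_* name is made up
for illustration, and the bit layout is an assumption rather than something
taken from the kernel headers.  It contrasts a dense switch, which a compiler
may lower to a jump table reached through an absolute address, with a plain
mask test that needs no table at all.  The sketch also assumes that a *_BIT
constant is a bit index, so it is shifted into a mask before being tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit indices, assumed for illustration only. */
#define DEMO_SEV_ENABLED_BIT		0
#define DEMO_SEV_ES_ENABLED_BIT		1
#define DEMO_SEV_SNP_ENABLED_BIT	2

/* Turn a bit index into a mask; testing the raw index would check the wrong bit. */
#define DEMO_BIT(n)	(UINT64_C(1) << (n))

static_assert(DEMO_BIT(DEMO_SEV_SNP_ENABLED_BIT) == 4, "index 2 maps to mask 0x4");

/* Hypothetical attribute list, loosely modelled on an attribute enum. */
enum demo_attr {
	DEMO_ATTR_GUEST_SEV,
	DEMO_ATTR_GUEST_SEV_ES,
	DEMO_ATTR_GUEST_SEV_SNP,
	DEMO_ATTR_OTHER,
};

/*
 * Generic query.  With enough dense cases a compiler is free to emit an
 * indirect jump through a table of absolute addresses, which is only valid
 * once the code runs at the addresses it was linked for.
 */
bool demo_platform_has(uint64_t status, enum demo_attr attr)
{
	switch (attr) {
	case DEMO_ATTR_GUEST_SEV:
		return (status & DEMO_BIT(DEMO_SEV_ENABLED_BIT)) != 0;
	case DEMO_ATTR_GUEST_SEV_ES:
		return (status & DEMO_BIT(DEMO_SEV_ES_ENABLED_BIT)) != 0;
	case DEMO_ATTR_GUEST_SEV_SNP:
		return (status & DEMO_BIT(DEMO_SEV_SNP_ENABLED_BIT)) != 0;
	default:
		return false;
	}
}

/* Direct query: a single AND and compare, no table and no indirect jump. */
static inline bool demo_snp_active(uint64_t status)
{
	return (status & DEMO_BIT(DEMO_SEV_SNP_ENABLED_BIT)) != 0;
}
```

Both helpers give the same answer for the SNP attribute; the only difference
is what the generated code has to dereference to get there, which is what
matters while still running from the identity mapping.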
---
 arch/x86/include/asm/sev.h |  4 ++++
 arch/x86/kernel/head64.c   | 10 +++++++---
 arch/x86/kernel/sev.c      | 16 +++++++++++-----
 3 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 19514524f0f8..701c561fdf08 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -193,6 +193,8 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate)
 void setup_ghcb(void);
 void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr,
 					 unsigned int npages);
+void __init __early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					  unsigned int npages);
 void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
 					unsigned int npages);
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op);
@@ -214,6 +216,8 @@ static inline void setup_ghcb(void) { }
 static inline void __init
 early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
 static inline void __init
+__early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
+static inline void __init
 early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, unsigned int npages) { }
 static inline void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op) { }
 static inline void snp_set_memory_shared(unsigned long vaddr, unsigned int npages) { }
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index bd4a34100ed0..5efab0d8e49d 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -127,7 +127,9 @@ static bool __head check_la57_support(unsigned long physaddr)
 }
 #endif
 
-static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdval_t *pmd)
+static unsigned long __head sme_postprocess_startup(struct boot_params *bp,
+						    pmdval_t *pmd,
+						    unsigned long physaddr)
 {
 	unsigned long vaddr, vaddr_end;
 	int i;
@@ -156,7 +158,9 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
 			 * address but the kernel is currently running off of the identity
 			 * mapping so use __pa() to get a *currently* valid virtual address.
 			 */
-			early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr), PTRS_PER_PMD);
+			if (sev_status & MSR_AMD64_SEV_SNP_ENABLED_BIT)
+				__early_snp_set_memory_shared(__pa(vaddr), __pa(vaddr),
+							      PTRS_PER_PMD);
 
 			i = pmd_index(vaddr);
 			pmd[i] -= sme_get_me_mask();
@@ -316,7 +320,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	 */
 	*fixup_long(&phys_base, physaddr) += load_delta - sme_get_me_mask();
 
-	return sme_postprocess_startup(bp, pmd);
+	return sme_postprocess_startup(bp, pmd, physaddr);
 }
 
 /* Wipe all early page tables except for the kernel symbol map */
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index c05f0124c410..48966ecc520e 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -714,12 +714,9 @@ void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long padd
 	pvalidate_pages(vaddr, npages, true);
 }
 
-void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
-					unsigned int npages)
+void __init __early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					  unsigned int npages)
 {
-	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
-		return;
-
 	/* Invalidate the memory pages before they are marked shared in the RMP table. */
 	pvalidate_pages(vaddr, npages, false);
 
@@ -727,6 +724,15 @@ void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr
 	early_set_pages_state(paddr, npages, SNP_PAGE_STATE_SHARED);
 }
 
+void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr,
+					unsigned int npages)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	__early_snp_set_memory_shared(vaddr, paddr, npages);
+}
+
 void __init snp_prep_memory(unsigned long paddr, unsigned int sz, enum psc_op op)
 {
 	unsigned long vaddr, npages;

base-commit: b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3
--