On 6/23/2017 5:00 AM, Borislav Petkov wrote:
> On Fri, Jun 16, 2017 at 01:56:19PM -0500, Tom Lendacky wrote:
>> Add the support to encrypt the kernel in-place. This is done by creating
>> new page mappings for the kernel - a decrypted write-protected mapping
>> and an encrypted mapping. The kernel is encrypted by copying it through
>> a temporary buffer.
>>
>> Signed-off-by: Tom Lendacky <thomas.lendacky at amd.com>
>> ---
>>  arch/x86/include/asm/mem_encrypt.h |    6 +
>>  arch/x86/mm/Makefile               |    2
>>  arch/x86/mm/mem_encrypt.c          |  314 ++++++++++++++++++++++++++++++++++++
>>  arch/x86/mm/mem_encrypt_boot.S     |  150 +++++++++++++++++
>>  4 files changed, 472 insertions(+)
>>  create mode 100644 arch/x86/mm/mem_encrypt_boot.S
>>
>> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
>> index af835cf..7da6de3 100644
>> --- a/arch/x86/include/asm/mem_encrypt.h
>> +++ b/arch/x86/include/asm/mem_encrypt.h
>> @@ -21,6 +21,12 @@
>>
>>  extern unsigned long sme_me_mask;
>>
>> +void sme_encrypt_execute(unsigned long encrypted_kernel_vaddr,
>> +                         unsigned long decrypted_kernel_vaddr,
>> +                         unsigned long kernel_len,
>> +                         unsigned long encryption_wa,
>> +                         unsigned long encryption_pgd);
>> +
>>  void __init sme_early_encrypt(resource_size_t paddr,
>>                                unsigned long size);
>>  void __init sme_early_decrypt(resource_size_t paddr,
>> diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
>> index 9e13841..0633142 100644
>> --- a/arch/x86/mm/Makefile
>> +++ b/arch/x86/mm/Makefile
>> @@ -38,3 +38,5 @@ obj-$(CONFIG_NUMA_EMU) += numa_emulation.o
>>  obj-$(CONFIG_X86_INTEL_MPX) += mpx.o
>>  obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o
>>  obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o
>> +
>> +obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o
>> diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
>> index 842c8a6..6e87662 100644
>> --- a/arch/x86/mm/mem_encrypt.c
>> +++ b/arch/x86/mm/mem_encrypt.c
>> @@ -24,6 +24,8 @@
>>  #include <asm/setup.h>
>>  #include <asm/bootparam.h>
>>  #include <asm/set_memory.h>
>> +#include <asm/cacheflush.h>
>> +#include <asm/sections.h>
>>
>>  /*
>>   * Since SME related variables are set early in the boot process they must
>> @@ -209,8 +211,320 @@ void swiotlb_set_mem_attributes(void *vaddr, unsigned long size)
>>          set_memory_decrypted((unsigned long)vaddr, size >> PAGE_SHIFT);
>>  }
>>
>> +static void __init sme_clear_pgd(pgd_t *pgd_base, unsigned long start,
>> +                                 unsigned long end)
>> +{
>> +        unsigned long pgd_start, pgd_end, pgd_size;
>> +        pgd_t *pgd_p;
>> +
>> +        pgd_start = start & PGDIR_MASK;
>> +        pgd_end = end & PGDIR_MASK;
>> +
>> +        pgd_size = (((pgd_end - pgd_start) / PGDIR_SIZE) + 1);
>> +        pgd_size *= sizeof(pgd_t);
>> +
>> +        pgd_p = pgd_base + pgd_index(start);
>> +
>> +        memset(pgd_p, 0, pgd_size);
>> +}
>> +
>> +#ifndef CONFIG_X86_5LEVEL
>> +#define native_make_p4d(_x)     (p4d_t) { .pgd = native_make_pgd(_x) }
>> +#endif
>
> Huh, why isn't this in arch/x86/include/asm/pgtable_types.h in the #else
> branch of #if CONFIG_PGTABLE_LEVELS > 4 ?

Normally the __p4d() macro would be used and that would be ok whether
CONFIG_X86_5LEVEL is defined or not. But since __p4d() is part of the
paravirt ops path I have to use native_make_p4d(). I'd be the only user
of the function and thought it would be best to localize it this way.

>
> Also
>
> ERROR: Macros with complex values should be enclosed in parentheses
> #105: FILE: arch/x86/mm/mem_encrypt.c:232:
> +#define native_make_p4d(_x)     (p4d_t) { .pgd = native_make_pgd(_x) }
>
> so why isn't it a function?

I can define it as an inline function.
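
As a rough sketch of what that inline form could look like (not the
final patch code): the types below are simplified stand-ins for the
kernel's pgd_t/p4d_t so that the snippet compiles on its own, and
native_p4d_val() is included only to show the round trip.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in types, assumed for illustration only. The real definitions
 * come from the kernel's page table headers.
 */
typedef uint64_t pgdval_t;
typedef uint64_t p4dval_t;
typedef struct { pgdval_t pgd; } pgd_t;
typedef struct { pgd_t pgd; } p4d_t;	/* a folded p4d just wraps a pgd */

/* The wrapper adds no storage, so a p4d entry is the size of a pgd entry. */
static_assert(sizeof(p4d_t) == sizeof(pgd_t), "folded p4d must match pgd size");

static inline pgd_t native_make_pgd(pgdval_t val)
{
	return (pgd_t) { .pgd = val };
}

/*
 * Inline-function form of the macro: the argument is type checked and
 * the "complex value" complaint from checkpatch no longer applies.
 */
static inline p4d_t native_make_p4d(p4dval_t val)
{
	return (p4d_t) { .pgd = native_make_pgd((pgdval_t)val) };
}

/* Read the raw value back out of the wrapped entry. */
static inline p4dval_t native_p4d_val(p4d_t p4d)
{
	return p4d.pgd.pgd;
}
```

In the 5-level case p4d_t is a real level rather than a wrapper, so the
#ifndef CONFIG_X86_5LEVEL guard from the patch would presumably stay
around a definition like this.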

>
>> +
>> +#define PGD_FLAGS       _KERNPG_TABLE_NOENC
>> +#define P4D_FLAGS       _KERNPG_TABLE_NOENC
>> +#define PUD_FLAGS       _KERNPG_TABLE_NOENC
>> +#define PMD_FLAGS       (__PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL)
>> +
>> +static void __init *sme_populate_pgd(pgd_t *pgd_base, void *pgtable_area,
>> +                                     unsigned long vaddr, pmdval_t pmd_val)
>> +{
>> +        pgd_t *pgd_p;
>> +        p4d_t *p4d_p;
>> +        pud_t *pud_p;
>> +        pmd_t *pmd_p;
>> +
>> +        pgd_p = pgd_base + pgd_index(vaddr);
>> +        if (native_pgd_val(*pgd_p)) {
>> +                if (IS_ENABLED(CONFIG_X86_5LEVEL))
>
> Err, I don't understand: so this is a Kconfig symbol and when it is
> enabled at build time, you do a 5level pagetable.
>
> But you can't stick a 5level pagetable to a hardware which doesn't know
> about it.

True, 5-level will only be turned on for specific hardware which is why
I originally had this as only 4-level pagetables. But in a comment from
you back on the v5 version you said it needed to support 5-level. I
guess we should have discussed this more, but I also thought that should
our hardware ever support 5-level paging in the future then this would
be good to go.

>
> Or do you mean that p4d layer folding at runtime to happen? (I admit, I
> haven't looked at that in detail.) But then I'd hope that the generic
> macros/functions would give you the ability to not care whether we have
> a p4d or not and not add a whole bunch of ifdeffery to this code.

The macros work great if you are not running identity mapped. You could
use p*d_offset() to move easily through the tables, but those functions
use __va() to generate table virtual addresses. I've seen where
boot/compressed/pagetable.c #defines __va() to work with identity mapped
pages but that would only work if I create a separate file just for this
function.

Given when this occurs it's very similar to what __startup_64() does in
regards to the IS_ENABLED(CONFIG_X86_5LEVEL) checks.

Thanks,
Tom

>
> Hmmm.
>
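
To illustrate the identity-mapped point above, here is a minimal
user-space sketch, not the patch code. Because an identity mapping makes
physical and virtual addresses equal, the address stored in a table
entry can be cast straight to a pointer, with no __va() involved. All
names (table_t, walk_to_pmd(), SKETCH_5LEVEL and so on) are
hypothetical, SKETCH_5LEVEL merely stands in for
IS_ENABLED(CONFIG_X86_5LEVEL), and real entries also carry high flag
bits that this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES		512
#define PAGE_SZ		4096
#define FLAGS_MASK	((uint64_t)(PAGE_SZ - 1))	/* low 12 bits hold flags */
#define PRESENT		0x1ULL

/* Hypothetical stand-in for IS_ENABLED(CONFIG_X86_5LEVEL); off here. */
#define SKETCH_5LEVEL	0

/* One page-sized, page-aligned table of 512 entries. */
typedef struct { _Alignas(PAGE_SZ) uint64_t e[ENTRIES]; } table_t;

static unsigned int idx(uint64_t vaddr, int shift)
{
	return (unsigned int)((vaddr >> shift) & (ENTRIES - 1));
}

/* Identity mapped: the stored address is directly usable as a pointer. */
static table_t *entry_to_table(uint64_t entry)
{
	return (table_t *)(uintptr_t)(entry & ~FLAGS_MASK);
}

static uint64_t make_entry(table_t *t, uint64_t flags)
{
	return (uint64_t)(uintptr_t)t | flags;
}

/* Walk pgd -> [p4d] -> pud and return the PMD slot, or NULL if unmapped. */
static uint64_t *walk_to_pmd(table_t *pgd, uint64_t vaddr)
{
	static const int shifts5[] = { 48, 39, 30 };
	static const int shifts4[] = { 39, 30 };
	const int *shifts = SKETCH_5LEVEL ? shifts5 : shifts4;
	int n = SKETCH_5LEVEL ? 3 : 2;
	table_t *t = pgd;

	for (int i = 0; i < n; i++) {
		uint64_t e = t->e[idx(vaddr, shifts[i])];

		if (!(e & PRESENT))
			return NULL;
		t = entry_to_table(e);
	}
	return &t->e[idx(vaddr, 21)];
}

static table_t demo_pgd, demo_pud, demo_pmd;

/* Link a 4-level chain covering vaddr; this helper has no p4d table. */
static table_t *build_demo(uint64_t vaddr)
{
	assert(!SKETCH_5LEVEL);
	demo_pgd.e[idx(vaddr, 39)] = make_entry(&demo_pud, PRESENT);
	demo_pud.e[idx(vaddr, 30)] = make_entry(&demo_pmd, PRESENT);
	return &demo_pgd;
}
```

The single compile-time constant selects the level list, which is
roughly the shape of the IS_ENABLED() checks being discussed, without
any p*d_offset() helpers.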