On Mon, 2020-09-14 at 17:12 -0700, Yu, Yu-cheng wrote: > On 9/14/2020 7:50 AM, Dave Hansen wrote: > > On 9/11/20 3:59 PM, Yu-cheng Yu wrote: > > ... > > > Here are the changes if we take the mprotect(PROT_SHSTK) approach. > > > Any comments/suggestions? > > > > I still don't like it. :) > > > > I'll also be much happier when there's a proper changelog to accompany > > this which also spells out the alternatives any why they suck so much. [...] I revised it. If this turns out needing more work/discussion, we can split it out from the shadow stack series. Thanks, Yu-cheng ------ >From 114f3108207e3eb15fe13b3ff43d1ac273b5072a Mon Sep 17 00:00:00 2001 From: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx> Date: Thu, 10 Sep 2020 16:41:30 -0700 Subject: [PATCH] mm: Introduce PROT_SHSTK for shadow stack There are three possible options to create a shadow stack allocation API: an arch_prctl, a new syscall, or adding PROT_SHSTK to mmap()/mprotect(). Each has its advantages and compromises. An arch_prctl() is the least intrusive. However, the existing x86 arch_prctl() takes only two parameters. Multiple parameters must be passed in a memory buffer. There is a proposal to pass more parameters in registers [1], but no active discussion on that. A new syscall minimizes compatibility issues and offers an extensible frame work to other architectures, but this will likely result in some overlap of mmap()/mprotect(). The introduction of PROT_SHSTK to mmap()/mprotect() takes advantage of existing APIs. The x86-specific PROT_SHSTK is translated to VM_SHSTK, and a shadow stack mapping is created without reinventing the wheel. There are potential pitfalls though. The most obvious one would be using this as a bypass to shadow stack protection. However, the attacker would have to get to the syscall first. Since arch_calc_vm_prot_bits() is modified, I have moved arch_vm_get_page _prot() and arch_calc_vm_prot_bits() to x86/include/asm/mman.h. This will be more consistent with other architectures. [1] https://lore.kernel.org/lkml/20200828121624.108243-1-hjl.tools@xxxxxxxxx/ Signed-off-by: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx> --- arch/x86/include/asm/mman.h | 63 ++++++++++++++++++++++++++++++++ arch/x86/include/uapi/asm/mman.h | 28 ++------------ mm/mprotect.c | 10 +++++ 3 files changed, 77 insertions(+), 24 deletions(-) create mode 100644 arch/x86/include/asm/mman.h diff --git a/arch/x86/include/asm/mman.h b/arch/x86/include/asm/mman.h new file mode 100644 index 000000000000..2f19e429c9e2 --- /dev/null +++ b/arch/x86/include/asm/mman.h @@ -0,0 +1,63 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_MMAN_H +#define _ASM_X86_MMAN_H + +#include <asm/cpufeature.h> +#include <uapi/asm/mman.h> + +#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS +/* + * Take the 4 protection key bits out of the vma->vm_flags + * value and turn them in to the bits that we can put in + * to a pte. + * + * Only override these if Protection Keys are available + * (which is only on 64-bit). + */ +#define arch_vm_get_page_prot(vm_flags) __pgprot( \ + ((vm_flags) & VM_PKEY_BIT0 ? _PAGE_PKEY_BIT0 : 0) | \ + ((vm_flags) & VM_PKEY_BIT1 ? _PAGE_PKEY_BIT1 : 0) | \ + ((vm_flags) & VM_PKEY_BIT2 ? _PAGE_PKEY_BIT2 : 0) | \ + ((vm_flags) & VM_PKEY_BIT3 ? _PAGE_PKEY_BIT3 : 0)) + +#define pkey_vm_prot_bits(prot, key) ( \ + ((key) & 0x1 ? VM_PKEY_BIT0 : 0) | \ + ((key) & 0x2 ? VM_PKEY_BIT1 : 0) | \ + ((key) & 0x4 ? VM_PKEY_BIT2 : 0) | \ + ((key) & 0x8 ? VM_PKEY_BIT3 : 0)) +#else +#define pkey_vm_prot_bits(prot, key) (0) +#endif + +static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot, + unsigned long pkey) +{ + unsigned long vm_prot_bits = pkey_vm_prot_bits(prot, pkey); + + vm_prot_bits |= ((prot & PROT_SHSTK) ? VM_SHSTK : 0); + return vm_prot_bits; +} +#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey) + +static inline bool arch_validate_prot(unsigned long prot, unsigned long addr) +{ + unsigned long supported = PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM; + + if (IS_ENABLED(CONFIG_X86_INTEL_SHADOW_STACK_USER) && + static_cpu_has(X86_FEATURE_SHSTK) && (prot & PROT_SHSTK)) { + supported |= PROT_SHSTK; + + /* + * A shadow stack mapping is indirectly writable by only + * the CALL and WRUSS instructions, but not other write + * instructions). PROT_SHSTK and PROT_WRITE are mutually + * exclusive. + */ + supported &= ~PROT_WRITE; + } + + return (prot & ~supported) == 0; +} +#define arch_validate_prot arch_validate_prot + +#endif /* _ASM_X86_MMAN_H */ diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h index d4a8d0424bfb..39bb7db344a6 100644 --- a/arch/x86/include/uapi/asm/mman.h +++ b/arch/x86/include/uapi/asm/mman.h @@ -1,31 +1,11 @@ /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ -#ifndef _ASM_X86_MMAN_H -#define _ASM_X86_MMAN_H +#ifndef _UAPI_ASM_X86_MMAN_H +#define _UAPI_ASM_X86_MMAN_H #define MAP_32BIT 0x40 /* only give out 32bit addresses */ -#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS -/* - * Take the 4 protection key bits out of the vma->vm_flags - * value and turn them in to the bits that we can put in - * to a pte. - * - * Only override these if Protection Keys are available - * (which is only on 64-bit). - */ -#define arch_vm_get_page_prot(vm_flags) __pgprot( \ - ((vm_flags) & VM_PKEY_BIT0 ? _PAGE_PKEY_BIT0 : 0) | \ - ((vm_flags) & VM_PKEY_BIT1 ? _PAGE_PKEY_BIT1 : 0) | \ - ((vm_flags) & VM_PKEY_BIT2 ? _PAGE_PKEY_BIT2 : 0) | \ - ((vm_flags) & VM_PKEY_BIT3 ? _PAGE_PKEY_BIT3 : 0)) - -#define arch_calc_vm_prot_bits(prot, key) ( \ - ((key) & 0x1 ? VM_PKEY_BIT0 : 0) | \ - ((key) & 0x2 ? VM_PKEY_BIT1 : 0) | \ - ((key) & 0x4 ? VM_PKEY_BIT2 : 0) | \ - ((key) & 0x8 ? VM_PKEY_BIT3 : 0)) -#endif +#define PROT_SHSTK 0x10 /* shadow stack pages */ #include <asm-generic/mman.h> -#endif /* _ASM_X86_MMAN_H */ +#endif /* _UAPI_ASM_X86_MMAN_H */ diff --git a/mm/mprotect.c b/mm/mprotect.c index a8edbcb3af99..27a1b7a4e89c 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -571,6 +571,16 @@ static int do_mprotect_pkey(unsigned long start, size_t len, goto out; } } + + /* + * Only anonymous mapping is suitable for shadow stack. A function + * call stack should not be backed by a file. + */ + if ((prot & PROT_SHSTK) && (vma->vm_file)) { + error = -EINVAL; + goto out; + } + if (start > vma->vm_start) prev = vma; -- 2.21.0