On 9/14/2020 7:50 AM, Dave Hansen wrote:
On 9/11/20 3:59 PM, Yu-cheng Yu wrote:
...
Here are the changes if we take the mprotect(PROT_SHSTK) approach.
Any comments/suggestions?
I still don't like it. :)
I'll also be much happier when there's a proper changelog to accompany
this which also spells out the alternatives any why they suck so much.
diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h
index d4a8d0424bfb..024f006fcfe8 100644
--- a/arch/x86/include/uapi/asm/mman.h
+++ b/arch/x86/include/uapi/asm/mman.h
@@ -4,6 +4,8 @@
#define MAP_32BIT 0x40 /* only give out 32bit addresses */
+#define PROT_SHSTK 0x10 /* shadow stack pages */
+
#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
/*
* Take the 4 protection key bits out of the vma->vm_flags
@@ -19,13 +21,35 @@
((vm_flags) & VM_PKEY_BIT2 ? _PAGE_PKEY_BIT2 : 0) | \
((vm_flags) & VM_PKEY_BIT3 ? _PAGE_PKEY_BIT3 : 0))
-#define arch_calc_vm_prot_bits(prot, key) ( \
+#define pkey_vm_prot_bits(prot, key) ( \
((key) & 0x1 ? VM_PKEY_BIT0 : 0) | \
((key) & 0x2 ? VM_PKEY_BIT1 : 0) | \
((key) & 0x4 ? VM_PKEY_BIT2 : 0) | \
((key) & 0x8 ? VM_PKEY_BIT3 : 0))
+#else
+#define pkey_vm_prot_bits(prot, key)
#endif
My inner compiler doesn't think this will compile:
( | shstk_vm_prot_bits(prot))
+#define shstk_vm_prot_bits(prot) ( \
+ (static_cpu_has(X86_FEATURE_SHSTK) && (prot & PROT_SHSTK)) ? \
+ VM_SHSTK : 0)
Why do you need to filter PROT_SHSTK twice. Won't the prot passed in
here be filtered by arch_validate_prot()?
X86_FEATURE_SHSTK is already checked by arch_validate_prot() and may be
skipped. PROT_SHSTK needs to be tested and translated to VM_SHSTK here.
+#define arch_calc_vm_prot_bits(prot, key) \
+ (pkey_vm_prot_bits(prot, key) | shstk_vm_prot_bits(prot))
+
IMNHO, this is eminently more readable if you do:
#define arch_calc_vm_prot_bits(prot, key) \
(shstk_vm_prot_bits(prot)) \
pkey_vm_prot_bits(prot, key))
BTW, can these be static inlines? I forget if I had a good reason for
making them #defines.
Probably, at compile time, ((key) & 0x1 ? (VM_PKEY_BIT0 : 0) cannot be
optimized out, unless there is a #define? Since we are adding the
shadow stack stuff, why don't we make it static inline?
Also do we want to check if the cpu has PKEY?
[...]
diff --git a/mm/mprotect.c b/mm/mprotect.c
index a8edbcb3af99..520bd8caa005 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -571,6 +571,17 @@ static int do_mprotect_pkey(unsigned long start, size_t
len,
goto out;
}
}
+
+ /*
+ * Only anonymous mapping is suitable for shadow stack.
+ */
Why?
When a file mapping is turned into shadow stack, then the logic is
strange. For instance, when a CALL triggers a shadow stack page fault,
do we page in from a file? At least this seems to defeat shadow stack.