On 1/30/22 13:18, Rick Edgecombe wrote: > Shadow stack accesses are writes from handle_mm_fault() perspective. So to > generate the correct PTE, maybe_mkwrite() will rely on the presence of > VM_SHADOW_STACK or VM_WRITE in the vma. > > In future patches, when VM_SHADOW_STACK is actually creatable by > userspace, a problem could happen if a user calls > mprotect( , , PROT_WRITE) on VM_SHADOW_STACK shadow stack memory. The code > would then be confused in the event of shadow stack accesses, and create a > writable PTE for a shadow stack access. Then the process would fault in a > loop. > > Prevent this from happening by blocking this kind of memory (VM_WRITE and > VM_SHADOW_STACK) from being created, instead of complicating the fault > handler logic to handle it. > > Add an x86 arch_validate_flags() implementation to handle the check. > Rename the uapi/asm/mman.h header guard to be able to use it for > arch/x86/include/asm/mman.h where the arch_validate_flags() will be. It would be great if this also said: There is an existing arch_validate_flags() hook for mmap() and mprotect() which allows architectures to reject unwanted ->vm_flags combinations. Add an implementation for x86. That's somewhat implied from what is there already, but making it more clear would be nice. There's a much higher bar to add a new arch hook than to just implement an existing one. > diff --git a/arch/x86/include/asm/mman.h b/arch/x86/include/asm/mman.h > new file mode 100644 > index 000000000000..b44fe31deb3a > --- /dev/null > +++ b/arch/x86/include/asm/mman.h > @@ -0,0 +1,21 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef _ASM_X86_MMAN_H > +#define _ASM_X86_MMAN_H > + > +#include <linux/mm.h> > +#include <uapi/asm/mman.h> > + > +#ifdef CONFIG_X86_SHADOW_STACK > +static inline bool arch_validate_flags(unsigned long vm_flags) > +{ > + if ((vm_flags & VM_SHADOW_STACK) && (vm_flags & VM_WRITE)) > + return false; > + > + return true; > +} The design decision here seems to be that VM_SHADOW_STACK is itself a pseudo-VM_WRITE flag. Like you said: "Shadow stack accesses are writes from handle_mm_fault()". Very early on, this series seems to have made the decision that shadow stacks are writable and need lots of write handling behavior, *BUT* shouldn't have VM_WRITE set. As a whole, that seems odd. The alternative would be *requiring* VM_WRITE and VM_SHADOW_STACK be set together. I guess the downside is that pte_mkwrite() would need to be made to work on shadow stack PTEs. That particular design decision was never discussed. I think it has a really big impact on the rest of the series. What do you think? Was it a good idea? Or would the alternative be more complicated than what you have now?