On Thu, Oct 10, 2024 at 05:32:05PM -0700, Deepak Gupta wrote:

> +unsigned long alloc_shstk(unsigned long addr, unsigned long size,
> +			  unsigned long token_offset, bool set_res_tok);
> +int shstk_setup(void);
> +int create_rstor_token(unsigned long ssp, unsigned long *token_addr);
> +bool cpu_supports_shadow_stack(void);

The cpu_ naming is confusing in an arm64 context, we use cpu_ for
functions that report if a feature is supported on the current CPU and
system_ for functions that report if a feature is enabled on the
system.

> +void set_thread_shstk_status(bool enable);

It might be better if this took the flags that the prctl() takes?  It
feels like

> +/* Flags for map_shadow_stack(2) */
> +#define SHADOW_STACK_SET_TOKEN	(1ULL << 0)	/* Set up a restore token in the shadow stack */
> +

We've also got SHADOW_STACK_SET_MARKER now.

> +bool cpu_supports_shadow_stack(void)
> +{
> +	return arch_cpu_supports_shadow_stack();
> +}
> +
> +bool is_shstk_enabled(struct task_struct *task)
> +{
> +	return arch_is_shstk_enabled(task);
> +}

Do we need these wrappers (or could they just be static inlines in the
header)?

> +void set_thread_shstk_status(bool enable)
> +{
> +	arch_set_thread_shstk_status(enable);
> +}

arm64 can return an error here, we reject a bunch of conditions like 32
bit threads and locked enable status.

> +unsigned long adjust_shstk_size(unsigned long size)
> +{
> +	if (size)
> +		return PAGE_ALIGN(size);
> +
> +	return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G));
> +}

static?

> +/*
> + * VM_SHADOW_STACK will have a guard page. This helps userspace protect
> + * itself from attacks. The reasoning is as follows:
> + *
> + * The shadow stack pointer(SSP) is moved by CALL, RET, and INCSSPQ. The
> + * INCSSP instruction can increment the shadow stack pointer. It is the
> + * shadow stack analog of an instruction like:
> + *
> + *   addq $0x80, %rsp
> + *
> + * However, there is one important difference between an ADD on %rsp
> + * and INCSSP. In addition to modifying SSP, INCSSP also reads from the
> + * memory of the first and last elements that were "popped". It can be
> + * thought of as acting like this:
> + *
> + *   READ_ONCE(ssp);       // read+discard top element on stack
> + *   ssp += nr_to_pop * 8; // move the shadow stack
> + *   READ_ONCE(ssp-8);     // read+discard last popped stack element
> + *
> + * The maximum distance INCSSP can move the SSP is 2040 bytes, before
> + * it would read the memory. Therefore a single page gap will be enough
> + * to prevent any operation from shifting the SSP to an adjacent stack,
> + * since it would have to land in the gap at least once, causing a
> + * fault.

This is all very x86 centric...

> +	if (create_rstor_token(mapped_addr + token_offset, NULL)) {
> +		vm_munmap(mapped_addr, size);
> +		return -EINVAL;
> +	}

Bikeshedding but can we call the function create_shstk_token() instead?
The rstor means absolutely nothing in an arm64 context.

> +SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags)
> +{
> +	bool set_tok = flags & SHADOW_STACK_SET_TOKEN;
> +	unsigned long aligned_size;
> +
> +	if (!cpu_supports_shadow_stack())
> +		return -EOPNOTSUPP;
> +
> +	if (flags & ~SHADOW_STACK_SET_TOKEN)
> +		return -EINVAL;

This needs SHADOW_STACK_SET_MARKER for arm64.

> +	if (addr && (addr & (PAGE_SIZE - 1)))
> +		return -EINVAL;

	if (!PAGE_ALIGNED(addr))
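Pulling the comments on this syscall together, the checks at the top
would end up looking something like the below.  This is just a sketch
rather than code from this patch, and system_supports_shadow_stack()
is the arm64 style naming mentioned above rather than an existing
function here:

	/* Naming per the arm64 cpu_/system_ convention */
	if (!system_supports_shadow_stack())
		return -EOPNOTSUPP;

	/* arm64 also has a flag asking for a top of stack marker */
	if (flags & ~(SHADOW_STACK_SET_TOKEN | SHADOW_STACK_SET_MARKER))
		return -EINVAL;

	/* addr == 0 (let the kernel pick a location) is still accepted */
	if (!PAGE_ALIGNED(addr))
		return -EINVAL;
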
> +int shstk_setup(void)
> +{

This is half of the implementation of the prctl() for enabling shadow
stacks.  Looking at the arm64 implementation this refactoring feels a
bit awkward, we don't have the one flag at a time requirement that x86
has and we structure things rather differently.  I'm not sure that the
arch_prctl() and prctl() are going to line up comfortably...

> +	struct thread_shstk *shstk = &current->thread.shstk;
> +	unsigned long addr, size;
> +
> +	/* Already enabled */
> +	if (is_shstk_enabled(current))
> +		return 0;
> +
> +	/* Also not supported for 32 bit */
> +	if (!cpu_supports_shadow_stack() ||
> +	    (IS_ENABLED(CONFIG_X86_64) && in_ia32_syscall()))
> +		return -EOPNOTSUPP;

We probably need a thread_supports_shstk(), arm64 has a similar check
for not 32 bit threads and I noted an issue with needing this check
elsewhere.

> +	/*
> +	 * For CLONE_VFORK the child will share the parents shadow stack.
> +	 * Make sure to clear the internal tracking of the thread shadow
> +	 * stack so the freeing logic run for child knows to leave it alone.
> +	 */
> +	if (clone_flags & CLONE_VFORK) {
> +		set_shstk_base_size(tsk, 0, 0);
> +		return 0;
> +	}

On arm64 we set the new thread's shadow stack pointer here, the logic
around that can probably also be usefully factored out (rough sketch
below).

> +	/*
> +	 * For !CLONE_VM the child will use a copy of the parents shadow
> +	 * stack.
> +	 */
> +	if (!(clone_flags & CLONE_VM))
> +		return 0;

Here also.
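To make that concrete, the rough shape I'd expect for the factored out
thread allocation path is below.  Only a sketch built from the helpers
quoted above: arch_set_new_thread_ssp() is a made up hook showing where
arm64 would update the new thread's shadow stack pointer (GCSPR_EL0),
it's not something in this patch.

static unsigned long shstk_alloc_thread_stack(struct task_struct *tsk,
					      unsigned long clone_flags,
					      unsigned long stack_size)
{
	unsigned long addr, size;

	if (!is_shstk_enabled(tsk))
		return 0;

	/* The vfork child runs on the parent's shadow stack */
	if (clone_flags & CLONE_VFORK) {
		set_shstk_base_size(tsk, 0, 0);
		return 0;
	}

	/* A !CLONE_VM child gets a copy of the parent's shadow stack */
	if (!(clone_flags & CLONE_VM))
		return 0;

	size = adjust_shstk_size(stack_size);
	addr = alloc_shstk(0, size, 0, false);
	if (IS_ERR_VALUE(addr))
		return addr;

	set_shstk_base_size(tsk, addr, size);

	/*
	 * Made up arch hook: this is where arm64 would point the new
	 * thread's GCSPR_EL0 at the top of the freshly allocated stack.
	 */
	arch_set_new_thread_ssp(tsk, addr + size);

	return addr + size;
}

The caller in copy_thread() would then just consume the returned
pointer, which keeps all the CLONE_* policy in one generic place.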