On Mon, 2022-10-03 at 15:23 -0700, Kees Cook wrote: > On Thu, Sep 29, 2022 at 03:29:25PM -0700, Rick Edgecombe wrote: > > [...] > > The following example demonstrates how to create a new shadow stack > > with > > map_shadow_stack: > > void *shstk = map_shadow_stack(adrr, stack_size, > > SHADOW_STACK_SET_TOKEN); > > typo: addr Yep, thanks. > > > [...] > > +451 common map_shadow_stack sys_map_shadow_stac > > k > > Isn't this "64", not "common"? Yes, this should have been changed after dropping 32 bit. > > > [...] > > +#define SHADOW_STACK_SET_TOKEN 0x1 /* Set up a restore token > > in the shadow stack */ > > I think this should get an intro comment, like: > > /* Flags for map_shadow_stack(2) */ > > Also, as with the other UAPI fields, please use "(1ULL << 0)" here. Ok. > > > @@ -62,24 +63,34 @@ static int create_rstor_token(unsigned long > > ssp, unsigned long *token_addr) > > if (write_user_shstk_64((u64 __user *)addr, (u64)ssp)) > > return -EFAULT; > > > > - *token_addr = addr; > > + if (token_addr) > > + *token_addr = addr; > > > > return 0; > > } > > > > Can this just be collapsed into the patch that introduces > create_rstor_token()? I mean, yea, that would be simpler. Breaking the changes apart was left over from when the signals placed a token, but didn't need this extra bit of functionality. > > > -static unsigned long alloc_shstk(unsigned long size) > > +static unsigned long alloc_shstk(unsigned long addr, unsigned long > > size, > > + unsigned long token_offset, bool > > set_res_tok) > > { > > int flags = MAP_ANONYMOUS | MAP_PRIVATE; > > struct mm_struct *mm = current->mm; > > - unsigned long addr, unused; > > + unsigned long mapped_addr, unused; > > > > mmap_write_lock(mm); > > - addr = do_mmap(NULL, addr, size, PROT_READ, flags, > > Oops, I missed in the other patch that "addr" was being passed here. > (uninitialized?) Argh, yes. I'll initialize in that patch and remove it here. > > > - VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL); > > - > > + mapped_addr = do_mmap(NULL, addr, size, PROT_READ, flags, > > + VM_SHADOW_STACK | VM_WRITE, 0, &unused, > > NULL); > > I don't see do_mmap() doing anything here to avoid remapping a prior > vma > as shstk. Is the intention to allow userspace to convert existing > VMAs? > This has caused pain in the past, perhaps force MAP_FIXED_NOREPLACE ? No that is not the intention. It should fail and MAP_FIXED_NOREPLACE looks like it will fit the bill. Thanks! > > > [...] > > @@ -174,6 +185,7 @@ int shstk_alloc_thread_stack(struct task_struct > > *tsk, unsigned long clone_flags, > > > > > > stack_size = PAGE_ALIGN(stack_size); > > + addr = alloc_shstk(0, stack_size, 0, false); > > if (IS_ERR_VALUE(addr)) > > return PTR_ERR((void *)addr); > > > > As mentioned earlier, I was expecting this patch to replace a > (missing) > call to alloc_shstk. i.e. expecting: > > - addr = alloc_shstk(stack_size); > > > @@ -395,6 +407,26 @@ int shstk_disable(void) > > return 0; > > } > > > > + > > +SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned > > long, size, unsigned int, flags) > > Please add kern-doc for this, with some notes. E.g. at least one > thing isn't immediately > obvious, maybe more: "addr" must be a multiple of 8. Ok. > > > +{ > > + unsigned long aligned_size; > > + > > + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) > > + return -ENOSYS; > > This needs to explicitly reject unknown flags[1], or expanding them > in the > future becomes very painful: > > if (flags & ~(SHADOW_STACK_SET_TOKEN)) > return -EINVAL; > > > [1] > https://docs.kernel.org/process/adding-syscalls.html#designing-the-api-planning-for-extension > Ok, good idea. > > + > > + /* > > + * An overflow would result in attempting to write the restore > > token > > + * to the wrong location. Not catastrophic, but just return the > > right > > + * error code and block it. > > + */ > > + aligned_size = PAGE_ALIGN(size); > > + if (aligned_size < size) > > + return -EOVERFLOW; > > The intention here is to allow userspace to ask for _less_ than a > page > size multiple, and to put the restore token there? > > Is it worth adding a check for size >= 8 here? Or, I guess it would > just > immediately crash on the next call? Funny you should ask... The glibc changes were doing this and then looking for the token at the end of the length that it passed (not the page aligned length). I had changed the kernel at one point to be page aligned and then had the fun of debugging the results. I thought, glibc is just wasting shadow stack. It should ask for page aligned shadow stacks. But HJ argued that the kernel shouldn't second guess what userspace is asking for based on HW page size details that don't have to do with the software interface. I was convinced by that argument, even though glibc is still wasting space. I could still be convinced the other way though. Glibc still has time to (and should) change. But yea, that was actually the intention. > > > + > > + return alloc_shstk(addr, aligned_size, size, flags & > > SHADOW_STACK_SET_TOKEN); > > +} > >