On Tue, Oct 4, 2022 at 3:56 PM Edgecombe, Rick P <rick.p.edgecombe@xxxxxxxxx> wrote: > > On Mon, 2022-10-03 at 15:23 -0700, Kees Cook wrote: > > On Thu, Sep 29, 2022 at 03:29:25PM -0700, Rick Edgecombe wrote: > > > [...] > > > The following example demonstrates how to create a new shadow stack > > > with > > > map_shadow_stack: > > > void *shstk = map_shadow_stack(adrr, stack_size, > > > SHADOW_STACK_SET_TOKEN); > > > > typo: addr > > Yep, thanks. > > > > > > > [...] > > > +451 common map_shadow_stack sys_map_shadow_stac > > > k > > > > Isn't this "64", not "common"? > > Yes, this should have been changed after dropping 32 bit. We don't support ia32. But this is used for x32 which is supported. > > > > > [...] > > > +#define SHADOW_STACK_SET_TOKEN 0x1 /* Set up a restore token > > > in the shadow stack */ > > > > I think this should get an intro comment, like: > > > > /* Flags for map_shadow_stack(2) */ > > > > Also, as with the other UAPI fields, please use "(1ULL << 0)" here. > > Ok. > > > > > > @@ -62,24 +63,34 @@ static int create_rstor_token(unsigned long > > > ssp, unsigned long *token_addr) > > > if (write_user_shstk_64((u64 __user *)addr, (u64)ssp)) > > > return -EFAULT; > > > > > > - *token_addr = addr; > > > + if (token_addr) > > > + *token_addr = addr; > > > > > > return 0; > > > } > > > > > > > Can this just be collapsed into the patch that introduces > > create_rstor_token()? > > I mean, yea, that would be simpler. Breaking the changes apart was left > over from when the signals placed a token, but didn't need this extra > bit of functionality. > > > > > > -static unsigned long alloc_shstk(unsigned long size) > > > +static unsigned long alloc_shstk(unsigned long addr, unsigned long > > > size, > > > + unsigned long token_offset, bool > > > set_res_tok) > > > { > > > int flags = MAP_ANONYMOUS | MAP_PRIVATE; > > > struct mm_struct *mm = current->mm; > > > - unsigned long addr, unused; > > > + unsigned long mapped_addr, unused; > > > > > > mmap_write_lock(mm); > > > - addr = do_mmap(NULL, addr, size, PROT_READ, flags, > > > > Oops, I missed in the other patch that "addr" was being passed here. > > (uninitialized?) > > Argh, yes. I'll initialize in that patch and remove it here. > > > > > > - VM_SHADOW_STACK | VM_WRITE, 0, &unused, NULL); > > > - > > > + mapped_addr = do_mmap(NULL, addr, size, PROT_READ, flags, > > > + VM_SHADOW_STACK | VM_WRITE, 0, &unused, > > > NULL); > > > > I don't see do_mmap() doing anything here to avoid remapping a prior > > vma > > as shstk. Is the intention to allow userspace to convert existing > > VMAs? > > This has caused pain in the past, perhaps force MAP_FIXED_NOREPLACE ? > > No that is not the intention. It should fail and MAP_FIXED_NOREPLACE > looks like it will fit the bill. Thanks! > > > > > > [...] > > > @@ -174,6 +185,7 @@ int shstk_alloc_thread_stack(struct task_struct > > > *tsk, unsigned long clone_flags, > > > > > > > > > stack_size = PAGE_ALIGN(stack_size); > > > + addr = alloc_shstk(0, stack_size, 0, false); > > > if (IS_ERR_VALUE(addr)) > > > return PTR_ERR((void *)addr); > > > > > > > As mentioned earlier, I was expecting this patch to replace a > > (missing) > > call to alloc_shstk. i.e. expecting: > > > > - addr = alloc_shstk(stack_size); > > > > > @@ -395,6 +407,26 @@ int shstk_disable(void) > > > return 0; > > > } > > > > > > + > > > +SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned > > > long, size, unsigned int, flags) > > > > Please add kern-doc for this, with some notes. E.g. at least one > > thing isn't immediately > > obvious, maybe more: "addr" must be a multiple of 8. > > Ok. > > > > > > +{ > > > + unsigned long aligned_size; > > > + > > > + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) > > > + return -ENOSYS; > > > > This needs to explicitly reject unknown flags[1], or expanding them > > in the > > future becomes very painful: > > > > if (flags & ~(SHADOW_STACK_SET_TOKEN)) > > return -EINVAL; > > > > > > [1] > > https://docs.kernel.org/process/adding-syscalls.html#designing-the-api-planning-for-extension > > > > Ok, good idea. > > > > + > > > + /* > > > + * An overflow would result in attempting to write the restore > > > token > > > + * to the wrong location. Not catastrophic, but just return the > > > right > > > + * error code and block it. > > > + */ > > > + aligned_size = PAGE_ALIGN(size); > > > + if (aligned_size < size) > > > + return -EOVERFLOW; > > > > The intention here is to allow userspace to ask for _less_ than a > > page > > size multiple, and to put the restore token there? > > > > Is it worth adding a check for size >= 8 here? Or, I guess it would > > just > > immediately crash on the next call? > > Funny you should ask... The glibc changes were doing this and then > looking for the token at the end of the length that it passed (not the > page aligned length). I had changed the kernel at one point to be page > aligned and then had the fun of debugging the results. I thought, glibc > is just wasting shadow stack. It should ask for page aligned shadow > stacks. But HJ argued that the kernel shouldn't second guess what > userspace is asking for based on HW page size details that don't have > to do with the software interface. I was convinced by that argument, > even though glibc is still wasting space. > > I could still be convinced the other way though. Glibc still has time > to (and should) change. But yea, that was actually the intention. Glibc requests a shadow stack of a given size and expects the restore token at the specific location. This is how glibc uses the restore token to switch to the new shadow stack. > > > > > + > > > + return alloc_shstk(addr, aligned_size, size, flags & > > > SHADOW_STACK_SET_TOKEN); > > > +} > > > > -- H.J.