On Tue, 2023-12-05 at 15:51 +0000, Mark Brown wrote:
> On Tue, Dec 05, 2023 at 12:26:57AM +0000, Edgecombe, Rick P wrote:
> > On Tue, 2023-11-28 at 18:22 +0000, Mark Brown wrote:
>
> > > -       size = adjust_shstk_size(stack_size);
> > > +       size = adjust_shstk_size(size);
> > >         addr = alloc_shstk(0, size, 0, false);
>
> > Hmm. I didn't test this, but in the copy_process(), copy_mm() happens
> > before this point. So the shadow stack would get mapped in current's MM
> > (i.e. the parent). So in the !CLONE_VM case with shadow_stack_size!=0
> > the SSP in the child will be updated to an area that is not mapped in
> > the child. I think we need to pass tsk->mm into alloc_shstk(). But such
> > an exotic clone usage does give me pause, regarding whether all of this
> > is premature.
>
> Hrm, right. And we then can't use do_mmap() either. I'd be somewhat
> tempted to disallow that specific case for now rather than deal with it
> though that's not really in the spirit of just always following what the
> user asked for.

Oh, yea. What a pain. It doesn't seem like we could easily even add a
do_mmap() variant that takes an mm either.

I did a quick logging test on a Fedora userspace. systemd (I think)
appears to do a clone(!CLONE_VM) with a stack passed. So the combo might
actually get used with a shadow_stack_size if it used clone3 some day.
At the same time, fixing clone to mmap() in the child doesn't seem
straightforward at all. Checking with some of our MM folks, the
suggestion was to look at doing the child's shadow stack mapping in
dup_mm() to avoid tripping over complications that happen when a remote
MM becomes more "live".

If we just punt on this combination for now, then the documented rules
for args->shadow_stack_size would be something like:

clone3 will use the parent's shadow stack when CLONE_VM is not present.
If CLONE_VFORK is set then it will use the parent's shadow stack only
when args->shadow_stack_size is non-zero.
In the cases when the parent's shadow stack is not used,
args->shadow_stack_size is used for the size whenever non-zero.

I guess it doesn't seem too overly complicated. But none of the options
seem great to me. I'd unhappily lean towards not supporting
shadow_stack_size!=0 && !CLONE_VM for now. But it seems like there may
be a user for the unsupported case, so this would be just improving
things a little and kicking the can down the road.

I also wonder if this is a sign to reconsider the earlier
token-consuming design.
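
For concreteness, the "punt" option (not supporting shadow_stack_size!=0
&& !CLONE_VM) could look roughly like the check below. This is only an
untested userspace sketch, not a patch against the series:
shstk_check_clone_args() is a hypothetical name, -EINVAL is an assumed
error code, and the clone flag values are redefined locally so the
snippet builds on its own.

```c
#include <errno.h>
#include <stdint.h>

/* Values mirror the Linux uapi clone flags; redefined to build standalone. */
#define CLONE_VM    0x00000100ULL
#define CLONE_VFORK 0x00004000ULL

/*
 * Hypothetical helper (name and placement are assumptions): refuse an
 * explicit shadow stack size when the child gets its own mm, since the
 * new shadow stack would otherwise be mapped in the parent's mm.
 */
static int shstk_check_clone_args(uint64_t clone_flags,
				  uint64_t shadow_stack_size)
{
	if (shadow_stack_size != 0 && !(clone_flags & CLONE_VM))
		return -EINVAL;	/* assumed error code */

	return 0;
}
```

With a check like this, the systemd-style clone(!CLONE_VM) keeps working
as long as no shadow_stack_size is passed, and only the combination
discussed above is rejected.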