Re: [PATCH RFT v7 9/9] selftests/clone3: Test shadow stack support

On Wed, Aug 07, 2024 at 01:39:27PM +0100, Mark Brown wrote:
> On Tue, Aug 06, 2024 at 10:08:44PM -0700, Kees Cook wrote:
> > On Tue, Aug 06, 2024 at 04:10:02PM +0100, Mark Brown wrote:
> 
> > > >   # Running test 'Shadow stack with no token'
> 
> > It took me a while to figure out where a thread switches shstk (even
> > without this series):
> 
> > kernel_clone, copy_process, copy_thread, fpu_clone, update_fpu_shstk
> > (and shstk_alloc_thread_stack is called just before update_fpu_shstk).
> 
> > I don't understand the token consumption in arch_shstk_post_fork(). This
> > wasn't needed before with the fixed-size new shstk, why is it needed
> > now?
> 
> Concerns were raised on earlier rounds of review that since instead of
> allocating the shadow stack as part of creating the new thread we are
> using a previously allocated shadow stack someone could use this as part
> of an exploit.  You could just jump on top of any existing shadow stack
> and cause writes to it.
> 
> > Anyway, my attempt to trace the shstk changes for the test:
> 
> > write(1, "TAP version 13\n", 15)        = 15
> > write(1, "1..2\n", 5)                   = 5
> > clone3({flags=0, exit_signal=18446744073709551615, stack=NULL, stack_size=0}, 104) = -1 EINVAL (Invalid argument)
> > write(1, "# clone3() syscall supported\n", 29) = 29
> > map_shadow_stack(NULL, 4096, 0)         = 125837480497152
> > write(1, "# Shadow stack supportd\n", 24) = 24
> > write(1, "# Running test 'Shadow stack wit"..., 44) = 44
> > getpid()                                = 4943
> > write(1, "# [4943] Trying clone3() with fl"..., 51) = 51
> > map_shadow_stack(NULL, 4096, 0)         = 125837480488960
> > clone3({flags=CLONE_VM, exit_signal=SIGCHLD, stack=NULL, stack_size=0, /* bytes 88..103 */ "\x00\xf0\x52\xd2\x72\x72\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00"} => {/* bytes 88..103 */ "\x00\xf0\x52\xd2\x72\x72\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00"}, 104) = 4944
> > getpid()                                = 4943
> > write(1, "# I am the parent (4943). My chi"..., 49strace: Process 4944 attached
> > ) = 49
> > [pid  4944] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_CPERR, si_addr=NULL} ---
> > [pid  4943] wait4(-1,  <unfinished ...>
> > [pid  4944] +++ killed by SIGSEGV (core dumped) +++
> 
> So we created the thread, then before we get to the wait4() in the
> parent we start delivering a SEGV_CPERR to the child.  The flow for the
> child is as expected.
> 
> > <... wait4 resumed>[{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], __WALL, NULL) = 4944
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_DUMPED, si_pid=4944, si_uid=0, si_status=SIGSEGV, si_utime=0, si_stime=0} ---
> > --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7272d21fffe8} ---
> > +++ killed by SIGSEGV (core dumped) +++
> 
> Then the parent gets an ordinary segfault, not a shadow stack specific
> one, like some memory got deallocated underneath it or a pointer got
> corrupted.
> 
> > [  569.153288] shstk_setup: clone3[4943] ssp:7272d2200000
> > [  569.153998] process: copy_thread: clone3[4943] new_ssp:7272d2530000
> > [  569.154002] update_fpu_shstk: clone3[4943] ssp:7272d2530000
> > [  569.154008] shstk_post_fork: clone3[4944]
> > [  569.154011] shstk_post_fork: clone3[4944] sending SIGSEGV post fork
> 
> > I don't see an update_fpu_shstk for 4944? Should I with this test?
> 
> I'd only expect to see one update, my understanding is that that update
> is for the child but happening in the context of the parent as the child
> is not yet started.

What I don't understand here is this: the parent is 4943, so this
report makes sense:

> > [  569.153288] shstk_setup: clone3[4943] ssp:7272d2200000

The child is 4944, yet I see:

> > [  569.153998] process: copy_thread: clone3[4943] new_ssp:7272d2530000
> > [  569.154002] update_fpu_shstk: clone3[4943] ssp:7272d2530000

These map to my logging:

copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
	...
	new_ssp = shstk_alloc_thread_stack(p, args);
	pr_err("%s: %s[%d] new_ssp:%lx\n", __func__, p->comm, task_pid_nr(p), new_ssp);

and

update_fpu_shstk(struct task_struct *dst, unsigned long ssp)
	...
	xstate->user_ssp = (u64)ssp;
	pr_err("%s: %s[%d] ssp:%lx\n", __func__, dst->comm, task_pid_nr(dst), ssp);

The child should be "p" (and "dst") here -- stuff is being copied from
current to p, but p is reporting itself as 4943 here? (Oh, this is
reporting pid, not tid... I bet that's what I've got wrong.)

> Does this help:
> 
> diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
> index 27acbdf44c5f..d7005974aff5 100644
> --- a/arch/x86/kernel/shstk.c
> +++ b/arch/x86/kernel/shstk.c
> @@ -258,6 +258,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *tsk,
>  	if (args->shadow_stack) {
>  		addr = args->shadow_stack;
>  		size = args->shadow_stack_size;
> +		shstk->base = 0;
> +		shstk->size = 0;
>  	} else {
>  		/*
>  		 * For CLONE_VFORK the child will share the parents

I'll fix my reporting and give this patch a try too. Thanks!

-Kees

-- 
Kees Cook



