Re: [PATCH RFT v7 9/9] selftests/clone3: Test shadow stack support

Mark Brown <broonie@xxxxxxxxxx> · Wed, 7 Aug 2024 13:39:27 +0100

On Tue, Aug 06, 2024 at 10:08:44PM -0700, Kees Cook wrote:
> On Tue, Aug 06, 2024 at 04:10:02PM +0100, Mark Brown wrote:

> > >   # Running test 'Shadow stack with no token'

> It took me a while to figure out where a thread switches shstk (even
> without this series):

> kernel_clone, copy_process, copy_thread, fpu_clone, update_fpu_shstk
> (and shstk_alloc_thread_stack is called just before update_fpu_shstk).

> I don't understand the token consumption in arch_shstk_post_fork(). This
> wasn't needed before with the fixed-size new shstk, why is it needed
> now?

Concerns were raised in earlier rounds of review that, since we now use
a previously allocated shadow stack rather than allocating one as part
of creating the new thread, someone could use this as part of an
exploit.  Without a token to validate and consume, you could just point
a new thread at the top of any existing shadow stack and cause writes
to it.

> Anyway, my attempt to trace the shstk changes for the test:

> write(1, "TAP version 13\n", 15)        = 15
> write(1, "1..2\n", 5)                   = 5
> clone3({flags=0, exit_signal=18446744073709551615, stack=NULL, stack_size=0}, 104) = -1 EINVAL (Invalid argument)
> write(1, "# clone3() syscall supported\n", 29) = 29
> map_shadow_stack(NULL, 4096, 0)         = 125837480497152
> write(1, "# Shadow stack supportd\n", 24) = 24
> write(1, "# Running test 'Shadow stack wit"..., 44) = 44
> getpid()                                = 4943
> write(1, "# [4943] Trying clone3() with fl"..., 51) = 51
> map_shadow_stack(NULL, 4096, 0)         = 125837480488960
> clone3({flags=CLONE_VM, exit_signal=SIGCHLD, stack=NULL, stack_size=0, /* bytes 88..103 */ "\x00\xf0\x52\xd2\x72\x72\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00"} => {/* bytes 88..103 */ "\x00\xf0\x52\xd2\x72\x72\x00\x00\x00\x10\x00\x00\x00\x00\x00\x00"}, 104) = 4944
> getpid()                                = 4943
> write(1, "# I am the parent (4943). My chi"..., 49strace: Process 4944 attached
> ) = 49
> [pid  4944] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_CPERR, si_addr=NULL} ---
> [pid  4943] wait4(-1,  <unfinished ...>
> [pid  4944] +++ killed by SIGSEGV (core dumped) +++

So we created the thread, then before we get to the wait4() in the
parent we start delivering a SEGV_CPERR to the child.  The flow for the
child is as expected.

> <... wait4 resumed>[{WIFSIGNALED(s) && WTERMSIG(s) == SIGSEGV && WCOREDUMP(s)}], __WALL, NULL) = 4944
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_DUMPED, si_pid=4944, si_uid=0, si_status=SIGSEGV, si_utime=0, si_stime=0} ---
> --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7272d21fffe8} ---
> +++ killed by SIGSEGV (core dumped) +++

Then the parent gets an ordinary segfault, not a shadow stack specific
one, like some memory got deallocated underneath it or a pointer got
corrupted.

> [  569.153288] shstk_setup: clone3[4943] ssp:7272d2200000
> [  569.153998] process: copy_thread: clone3[4943] new_ssp:7272d2530000
> [  569.154002] update_fpu_shstk: clone3[4943] ssp:7272d2530000
> [  569.154008] shstk_post_fork: clone3[4944]
> [  569.154011] shstk_post_fork: clone3[4944] sending SIGSEGV post fork

> I don't see an update_fpu_shstk for 4944? Should I with this test?

I'd only expect to see one update; my understanding is that that update
is for the child but happens in the context of the parent, as the child
has not yet started.

Does this help:

diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 27acbdf44c5f..d7005974aff5 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -258,6 +258,8 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *tsk,
 	if (args->shadow_stack) {
 		addr = args->shadow_stack;
 		size = args->shadow_stack_size;
+		shstk->base = 0;
+		shstk->size = 0;
 	} else {
 		/*
 		 * For CLONE_VFORK the child will share the parents