Re: [PATCH v10 24/40] arm64/signal: Expose GCS state in signal frames

Dave Martin <Dave.Martin@xxxxxxx> · Thu, 15 Aug 2024 16:33:25 +0100

On Thu, Aug 15, 2024 at 04:05:32PM +0100, Mark Brown wrote:
> On Thu, Aug 15, 2024 at 03:01:21PM +0100, Dave Martin wrote:
> 
> > My thought was that if libc knows about shadow stacks, it is probably
> > going to be built to use them too and so would enable shadow stack
> > during startup to protect its own code.
> 
> > (Technically those would be independent decisions, but it seems a good
> > idea to use a hardening feature if you know about it and it is present.)
> 
> > If so, shadow stacks might always get turned on before the main program
> > gets a look-in.
> 
> > Or is that not the expectation?
> 
> The expectation (at least for arm64) is that the main program will only
> have shadow stacks if everything says it can support them.  If the
> dynamic linker turns them on during startup prior to parsing the main
> executable, this means that it should turn them off before actually
> starting the executable, taking care to consider any locking of features.

Hmm, so we really do get a clear "enable shadow stack" call to the
kernel, which we can reasonably expect won't happen for ancient software?

If so, I think dumping the GCS state in the sigframe could be made
conditional on that without problems (?)

(We could always make it unconditional later if it turns out that that
approach breaks something.)
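
For illustration, here is roughly how userspace could ask whether that
"enable" call has happened for the current thread.  This is only a
sketch: it assumes the arch-agnostic shadow stack prctl() interface, and
the fallback values below are what I believe linux/prctl.h uses, so
treat them as assumptions.  shadow_stack_enabled() is a made-up helper
name.  On a kernel or architecture without the prctl it just reports
"not enabled".

```c
#include <sys/prctl.h>

/* Fallbacks for older toolchain headers.  Assumed values, check them
 * against your kernel's linux/prctl.h. */
#ifndef PR_GET_SHADOW_STACK_STATUS
#define PR_GET_SHADOW_STACK_STATUS 74
#endif
#ifndef PR_SHADOW_STACK_ENABLE
#define PR_SHADOW_STACK_ENABLE (1UL << 0)
#endif

/*
 * Hypothetical helper: returns 1 if the calling thread has shadow
 * stacks enabled, 0 otherwise (including when the prctl is not
 * supported at all).
 */
int shadow_stack_enabled(void)
{
	unsigned long status = 0;

	if (prctl(PR_GET_SHADOW_STACK_STATUS, &status, 0UL, 0UL, 0UL) != 0)
		return 0;

	return (status & PR_SHADOW_STACK_ENABLE) ? 1 : 0;
}
```

Note this only queries the state.  Actually enabling a shadow stack
from ordinary C code is a different matter, since the function that
enables it then has to return.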

> 
> > > > Is there any scenario where it is legitimate for the signal handler to
> > > > change the shadow stack mode or to return with an altered GCSPR_EL0?
> 
> > > If userspace can rewrite the stack pointer on return (eg, to return to a
> > > different context as part of userspace threading) then it will need to
> 
> > Do we know of code that actually does that?  IIUC, trying to do this is
> > totally broken on most arches nowadays; making it work requires a
> > reentrant C library and/or logic to defer signals around critical
> > sections in userspace.
> 
> > "Real" threading makes this pretty pointless for the most part.
> 
> > Related question: does shadow stack work with ucontext-based coroutines?
> > Per-context stacks need to be allocated by the program for that.
> 
> Yes, ucontext based coroutines are the sort of thing I meant when I was
> talking about returning to a different context.

Ah, right.  Doing this asynchronously on the back of a signal (instead
of doing a sigreturn) is the bad thing.  setcontext() officially
doesn't work for this any more, and doing it by hacking or rebuilding
the sigframe is extremely hairy and probably a terrible idea for the
reasons I gave.

> > > be able to also update GCSPR_EL0 to something consistent otherwise
> > > attempting to return from the interrupted context isn't going to go
> > > well.  Changing the mode is a bit more exotic, as it is in general.
> > > It's as much to provide information to the signal handler as anything
> > > else.

Note, the way sigcontext (a.k.a. mcontext).__reserved[] is used by
glibc for the ucontext API is inspired by the way the kernel uses it,
but not guaranteed to be compatible.  For the ucontext API glibc
doesn't try to store/restore asynchronous contexts (which is why
setcontext() from a signal handler is totally broken), so there is no
need to store SVE/SME state and hence lots of free space, so this
probably is supportable with shadow stacks -- if there's a way to
allocate them.  This series would be unaffected either way.

(IIRC, the contents of mcontext.__reserved[] are totally incompatible
with what the kernel puts in there, and don't have the same record
structure.)
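
For reference, the kernel-side record structure in a real sigframe is a
chain of { magic, size } headers ending in a zero header.  A minimal
sketch of walking it follows.  It uses a local copy of the header
layout rather than the uapi struct so that it builds anywhere;
find_record() and demo_gcs_offset() are made-up names, and the GCS
magic value is my assumption about what this series defines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FPSIMD_MAGIC 0x46508001u
#define GCS_MAGIC    0x47435300u	/* assumed value, "GCS\0" */

/* Local mirror of the kernel's { magic, size } record header. */
struct ctx_head {
	uint32_t magic;
	uint32_t size;		/* size of whole record, header included */
};

/* Return the byte offset of the first record with this magic, or -1. */
long find_record(const unsigned char *buf, size_t len, uint32_t magic)
{
	size_t off = 0;

	while (len - off >= sizeof(struct ctx_head)) {
		struct ctx_head h;

		memcpy(&h, buf + off, sizeof(h));
		if (h.magic == 0 && h.size == 0)
			break;			/* terminator record */
		if (h.size < sizeof(h) || h.size > len - off)
			break;			/* malformed chain */
		if (h.magic == magic)
			return (long)off;
		off += h.size;
	}
	return -1;
}

static void put_head(unsigned char *buf, size_t off,
		     uint32_t magic, uint32_t size)
{
	struct ctx_head h = { magic, size };

	memcpy(buf + off, &h, sizeof(h));
}

/* Build a fake __reserved[] area: FPSIMD record, GCS record, terminator. */
long demo_gcs_offset(void)
{
	static unsigned char buf[4096];

	memset(buf, 0, sizeof(buf));
	put_head(buf, 0, FPSIMD_MAGIC, 528);	/* header + fpsr/fpcr + 32 vregs */
	put_head(buf, 528, GCS_MAGIC, 32);	/* header + three 64-bit fields */
	return find_record(buf, sizeof(buf), GCS_MAGIC);
}
```

A signal handler doing this for real would point find_record() at
uc_mcontext.__reserved in the ucontext it is handed.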

> 
> > I'm not sure that we should always put information in the signal frame
> > that the signal handler can't obtain directly.
> 
> > I guess it's harmless to include this, though.
> 
> If we don't include it then if different ucontexts have different GCS
> features enabled we run into trouble on context switch.

As outlined above, nowadays you can only use setcontext() on a context
obtained from getcontext().  Using setcontext() on a context obtained
from a sigframe works by accident or not at all, but in any case
coroutines always switch synchronously and don't rely on doing this.

(See where setcontext deals with the FPSIMD regs:
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/aarch64/setcontext.S;h=ba659438c564dc3bbbb8d6039030e2c492649534;hb=HEAD )
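
To be concrete about "switch synchronously": the sketch below is the
usual ucontext coroutine pattern, where every switch is an explicit
swapcontext() call from normal code and no signal is involved.  The
names (run_coroutine() and friends) are made up for illustration.  It
allocates only a normal stack for the coroutine; how a per-context
shadow stack would be allocated is exactly the libc question and is not
shown.

```c
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static char co_stack[64 * 1024];	/* normal stack for the coroutine */
static int steps;

static void coroutine(void)
{
	for (int i = 0; i < 3; i++) {
		steps++;
		/* Yield: a synchronous switch back to the caller. */
		swapcontext(&co_ctx, &main_ctx);
	}
}

/* Resume the coroutine three times and report how far it got. */
int run_coroutine(void)
{
	steps = 0;

	/* The context comes from getcontext(), not from a sigframe. */
	getcontext(&co_ctx);
	co_ctx.uc_stack.ss_sp = co_stack;
	co_ctx.uc_stack.ss_size = sizeof(co_stack);
	co_ctx.uc_link = &main_ctx;
	makecontext(&co_ctx, coroutine, 0);

	for (int i = 0; i < 3; i++)
		swapcontext(&main_ctx, &co_ctx);

	return steps;
}
```

Since both sides of each switch are at a known call site, only the
callee-saved state needs to be preserved, which is why glibc gets away
with not storing SVE/SME state here.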

So, overall I think making ucontext coroutines work with GCS is purely
a libc matter that is "interesting" here, but not something we need to
worry about.

> > > > Is the guarded stack considered necessary (or at least beneficial) for
> > > > backtracing, or is the regular stack sufficient?
> 
> > > It's potentially beneficial, being less vulnerable to corruption and
> > > simpler to parse if all you're interested in is return addresses.
> > > Profiling in particular was mentioned, grabbing a linear block of memory
> > > will hopefully be less overhead than chasing down the stack.  The
> > > regular stack should generally be sufficient though.
> 
> > I guess we can't really argue that the shadow stack pointer is
> > redundant here though.  The whole point of shadow stacks is to make
> > things more robust...
> 
> > Just kicking the tyres on the question of whether we need it here, but
> > I guess it's hard to make a good case for saying "no".
> 
> Indeed.  The general model here is that we don't break userspace that
> relies on parsing the normal stack (so the GCS is never *necessary*) but
> clearly you want to have it.

Agreed, but perhaps not in programs that haven't enabled shadow stack?

Cheers
---Dave