Re: [PATCH v11 25/39] arm64/signal: Expose GCS state in signal frames

Mark Brown <broonie@xxxxxxxxxx> · Fri, 23 Aug 2024 23:01:13 +0100

On Fri, Aug 23, 2024 at 04:59:11PM +0100, Catalin Marinas wrote:
> On Fri, Aug 23, 2024 at 11:25:30AM +0100, Mark Brown wrote:

> > We could store either the cap token or the interrupted GCSPR_EL0 (the
> > address below the cap token).  It felt more joined up to go with the cap
> > token since notionally signal return is consuming the cap token but
> > either way would work, we could just add an offset when looking at the
> > pointer.

> In a hypothetical sigaltshadowstack() scenario, would the cap go on the
> new signal shadow stack or on the old one? I assume on the new one but
> in sigcontext we'd save the original GCSPR_EL0. In such hypothetical
> case, the original GCSPR_EL0 would not need 8 subtracted.

I would have put the token on the old stack since that's what we'd be
returning to.  This raises interesting questions about what happens if
the reason for the signal is that we just overflowed the normal stack
(which are among the issues that have got in the way of working out if
or how we do something with sigaltshadowstack).  I'm not clear what the
purpose of the token would be on the new stack, the token basically says
"this is somewhere we can sigreturn to", that's not the case for the
alternative stack.

> I need to think some more about this. The gcs_restore_signal() function
> makes sense, it starts with the current GCSPR_EL0 on the signal stack
> and consumes the token, adds 8 to the shadow stack pointer. The
> restore_gcs_context() one is confusing as it happens before consuming
> the cap token and assumes that the GCSPR_EL0 value actually points to
> the signal stack. If we ever implement an alternative shadow stack, the
> original GCSPR_EL0 of the interrupted context would be lost. I know it's
> not planned for now but the principles should be the same. The
> sigframe.uc should store the interrupted state.

I think the issues you're pointing out here go to the thing with the cap
token marking a place we can sigreturn to and therefore being on the
original stack.

> To me the order for sigreturn should be first to consume the cap token,
> validate it etc. and then restore GCSPR_EL0 to whatever was saved in the
> sigframe.uc prior to the signal being delivered.

To me what we're doing here is that the signal frame says where
userspace wants to point GCSPR_EL0 in the returned context, we then go
and confirm that this is a valid address by looking at it and checking
for a token.  The token serves to validate what was saved in sigframe.uc
so that it can't just be pointed at some random address.

> > restore_gcs_context() is loading values from the signal frame in memory
> > (which will only happen if a GCS context is present) then
> > gcs_restore_signal() consumes the token at the top of the stack.  The
> > split is because userspace can skip the restore_X_context() functions
> > for the optional signal frame elements by removing them from the context
> > but we want to ensure that we always consume a token.

> I agree we should always consume a token but this should be done from
> the actual hardware GCSPR_EL0 value on the sigreturn call rather than
> the one restored from sigframe.uc. The restoring should be the last
> step.

If we look for a token at the GCSPR_EL0 on sigreturn then it wouldn't be
valid to call sigreturn() (well, without doing something first to unwind
the stack which seems hostile and would require code to become GCS aware
that wouldn't otherwise), if you do a sigreturn from somewhere other
than the vDSO then you'd have at least the vDSO signal frame left in the
GCS.  You'd also be able to point GCSPR_EL0 to anywhere since we'd just
load a value from the signal frame but not do the cap verification.
Attachment:
signature.asc

Description: PGP signature