Re: [musl] Re: [PATCH v8 00/38] arm64/gcs: Provide support for GCS in userspace

"dalias@xxxxxxxx" <dalias@xxxxxxxx> · Wed, 21 Feb 2024 15:18:37 -0500

On Wed, Feb 21, 2024 at 07:22:21PM +0000, Edgecombe, Rick P wrote:
> On Wed, 2024-02-21 at 14:06 -0500, dalias@xxxxxxxx wrote:
> > Due to arbitrarily nestable signal frames, no, this does not suffice.
> > An interrupted operation using the lock could be arbitrarily delayed,
> > even never execute again, making any call to dlopen deadlock.
> 
> Doh! Yep, it is not robust to this. The only thing that could be done
> would be a timeout in dlopen(). Which would make the whole thing just
> better than nothing.
> 
> > 
> > > 
> > 
> > It's fine to turn RDSSP into an actual emulated read of the SSP, or
> > at
> > least an emulated load of zero so that uninitialized data is not left
> > in the target register.
> 
> We can't intercept RDSSP, but it becomes a NOP by default. (disclaimer
> x86-only knowledge).

OK, then I think the contract just has to be that userspace, in a
process that might dynamically disable shadow stack, needs to do
something like xor %reg,%reg before rdssp so that the outcome is
deterministic in disabled case.

> >  If doing the latter, code working with the
> > shadow stack just needs to be prepared for the possibility that it
> > could be async-disabled, and check the return value.
> > 
> > I have not looked at all the instructions that become #UD but I
> > suspect they all have reasonable trivial ways to implement a
> > "disabled" version of them that userspace can act upon reasonably.
> 
> This would have to be thought through functionally and performance
> wise. I'm not opposed if can come up with a fully fleshed out plan. How
> serious are you in pursuing musl support, if we had something like
> this?

Up til this thread, my position was pretty much "nope" because it
looked like it could not be done in a way compatible with existing
interface requirements.

However, what's been discussed here, contingent on a dynamic-disable
(ideally allowing choice of per-thread or global, to minimize loss of
hardening properties),

Personally, I believe shadow stack has very low hardening value
relative to cost/complexity, and my leaning would be just to ignore
it. However, I also know it becomes marketing pressure, including
pressure on distros that use musl -- "Why doesn't [distro] do shadow
stack?? I thought you were security oriented!!!" -- and if it can be
done in a non-breaking and non-invasive way, I think it's reasonable
to pursue and make something work.

> HJ, any thoughts on whether glibc would use this as well?
> 
> It is probably worth mentioning that from the security side (as Mark
> mentioned there is always tension in the tradeoffs on these features),
> permissive mode is seen by some as something that weakens security too
> much. Apps could call dlopen() on a known unsupported DSO before doing
> ROP. I don't know if you have any musl users with specific shadow stack
> use cases to ask about this.

Yes, this is potentially an argument for something like the option 2,
if there's a way to leave SS enabled but then trap when something goes
wrong, detect if it went wrong via SS-incompatible library code, and
lazily disable SS, otherwise terminate.

But I just realized, I'm not even sure why shared libraries need to be
explicitly SS-compatible. Unless they're doing their own asm stack
switches, shouldn't they just work by default? And since I don't
understand this reason, I also don't understand what the failure mode
is when an incompatible library is loaded, and thus whether it would
be possible to detect and attribute the failure to the library, or
whether the library would induce failure somewhere else.

Anyway, a mechanism to allow the userspace implementation to disable
SS doesn't inherently expose a means to do that. A system integrator
doing maximum hardening might choose to build all libraries as
SS-compatible, or to patch the loader to refuse to load incompatible
libraries.

Rich