Re: [RFC PATCH v2 4/4] x86/vdso: Add __vdso_sgx_enter_enclave() to wrap SGX enclave transitions

Andy Lutomirski <luto@xxxxxxxxxxxxxx> · Fri, 7 Dec 2018 15:33:57 -0800

On Fri, Dec 7, 2018 at 1:26 PM Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> On Fri, Dec 07, 2018 at 12:16:59PM -0800, Andy Lutomirski wrote:
> >
> >
> > > On Dec 7, 2018, at 12:09 PM, Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:
> > >
> > >> On Fri, Dec 07, 2018 at 11:23:10AM -0800, Andy Lutomirski wrote:
> > >>
> > >> Ah, I see. You’re saying that, if the non-enclave stare is corrupted such
> > >> that RIP  is okay and RSP still points somewhere reasonable but the return
> > >> address is garbage, then we can at least get to the fault handler and print
> > >> something?
> > >
> > > Yep.  Even for something more subtle like GPR corruption it could dump the
> > > entire call stack before attempting to return back up.
> > >
> > >> This only works if the fault handler pointer itself is okay, though, which
> > >> somewhat limits the usefulness, given that its pointer is quite likely to
> > >> be on the stack very close to the return address.
> > >
> > > Yeah, it's not a silver bullet by any means, but it does seem useful for at
> > > least some scenarios.  Even exploding when invoking the handler instead of
> > > at a random point might prove useful, e.g. "calling my exit handler exploded,
> > > maybe my enclave corrupted the stack!".
> >
> > Here’s another idea: calculate some little hash or other checksum of
> > RSP, RBP, and perhaps a couple words on the stack, and do:
>
> Corrupting RSP and RBP as opposed to the stack memory seems much less
> likely since the enclave would have to poke into the save state area.
> And as much as I dislike the practice of intentionally manipulating
> SSA.RSP, preventing the user from doing something because we're "helping"
> doesn't seem right.
>
> > call __vdso_enclave_corrupted_state
> >
> > If you get a mismatch after return. That function could be:
> >
> > call __vdso_enclave_corrupted_state:
> >   ud2
> >
> > And now the debug trace makes it very clear what happened.
> >
> > This may or may not be worth the effort.
>
> Running a checksum on the stack for every exit doesn't seem like it'd
> be worth the effort, especially since this type of bug should be quite
> rare, at least in production environments.
>
> If we want to pursue the checksum idea I think the easiest approach
> would be to combine it with an exit_handler and do a simple check on
> the handler.  It'd be minimal overhead in the fast path and would flag
> cases where invoking exit_handle() would explode, while deferring all
> other checks to the user.

How about this variant?

#define MAGIC 0xaaaabbbbccccddddul
#define RETADDR_HASH ((unsigned long)__builtin_return_address(0) ^ MAGIC)

void foo(void)
{
    volatile unsigned long hash = RETADDR_HASH;

    /* placeholder for your actual code */
    asm volatile ("nop");

    if (hash != RETADDR_HASH)
        asm volatile ("ud2");
}

But I have a real argument for dropping exit_handler: in this new age
of Spectre, the indirect call is a retpoline, and it's therefore quite
slow.  So I'm not saying NAK, but I do think it's unnecessary.

I don't suppose you've spent a bunch of time programming in the
continuation-passing style? :)