Re: RFC: userspace exception fixups

Andy Lutomirski <luto@xxxxxxxxxx> · Tue, 6 Nov 2018 13:41:53 -0800

On Tue, Nov 6, 2018 at 1:07 PM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
>
>
> > On Nov 6, 2018, at 1:00 PM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> >
> >> On 11/6/18 12:12 PM, Andy Lutomirski wrote:
> >> True, but what if we have a nasty enclave that writes to memory just
> >> below SP *before* decrementing SP?
> >
> > Yeah, that would be unfortunate.  If an enclave did this (roughly):
> >
> >    1. EENTER
> >    2. Hardware sets eenter_hwframe->sp = %sp
> >    3. Enclave runs... wants to do out-call
> >    4. Enclave sets up parameters:
> >        memcpy(&eenter_hwframe->sp[-offset], arg1, size);
> >        ...
> >    5. Enclave sets eenter_hwframe->sp -= offset
> >
> > If we got a signal between 4 and 5, we'd clobber the copy of 'arg1' that
> > was on the stack.  The enclave could easily fix this by moving ->sp first.
> >
> > But, this is one of those "fun" parts of the ABI that I think we need to
> > talk about.  If we do this, we also basically require that the code
> > which handles asynchronous exits must *not* write to the stack.  That's
> > not hard because it's typically just a single ERESUME instruction, but
> > it *is* a requirement.
> >
>
> I was assuming that the async exit stuff was completely hidden by the
> API. The AEP code would decide whether the exit got fixed up by the
> kernel (which may or may not be easy to tell: can the code even tell
> without kernel help whether it was, say, an IRQ vs #UD?) and then
> either do ERESUME or cause sgx_enter_enclave() to return with an
> appropriate return value.
>
>

Sean, how does the current SDK AEX handler decide whether to do
EENTER, ERESUME, or just bail and consider the enclave dead?  It seems
like the *CPU* could give a big hint, but I don't see where there is
any architectural indication of why the AEX code got called or any
obvious way for the user code to know whether the exit was fixed up by
the kernel.