Re: [RFC PATCH 4/4] x86/vdso: x86/sgx: Allow the user to exit the vDSO loop on interrupts

Jethro Beekman <jethro@xxxxxxxxxxxx> · Thu, 20 Aug 2020 19:53:32 +0200

On 2020-08-20 19:44, Andy Lutomirski wrote:
> On Thu, Aug 20, 2020 at 4:20 AM Jethro Beekman <jethro@xxxxxxxxxxxx> wrote:
>>
>> On 2020-08-19 17:02, Andy Lutomirski wrote:
>>> On Wed, Aug 19, 2020 at 7:21 AM Jethro Beekman <jethro@xxxxxxxxxxxx> wrote:
>>>>
>>>> On 2020-08-18 19:15, Andy Lutomirski wrote:
>>>>> On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
>>>>> <sean.j.christopherson@xxxxxxxxx> wrote:
>>>>>>
>>>>>> Allow userspace to exit the vDSO on interrupts that are acknowledged
>>>>>> while the enclave is active.  This allows the user's runtime to switch
>>>>>> contexts at opportune times without additional overhead, e.g. when using
>>>>>> an M:N threading model (where M user threads run N TCSs, with N > M).
>>>>>
>>>>> This is IMO rather odd.  We don't support this type of notification on
>>>>> interrupts for normal user code.  The fact user code can detect
>>>>> interrupts during enclave execution is IMO an oddity of SGX, and I
>>>>> have asked Intel to consider replacing the AEX mechanism with
>>>>> something more transparent to user mode.  If this ever happens, this
>>>>> mechanism is toast.
>>>>
>>>> Let's design the current interface for the current architecture. We can deal with a new architecture if and when Intel provides it.
>>>
>>> No.  If Intel fixes the architecture, the proposed interface will
>>> *stop working*.
>>>
>>>>
>>>>> Even without architecture changes, building a *reliable* M:N threading
>>>>> mechanism on top of this will be difficult or impossible, as there is
>>>>> no particular guarantee that a thread will get timing interrupts at> all or that these interrupts will get lucky and hit enclave code, thus
>>>>> triggering an AEX.  We certainly don't, and probably never will,
>>>>> support any corresponding feature for non-enclave code.
>>>>
>>>> There's no guarantee, but this vDSO exit mechanism is a prerequisite. Both for context switching and aborting an enclave, userspace *must* have a way to trigger exit from enclave mode *and* recover the user stack in a sane manner. Userspace *should* also be able to do this in a way that's compatible with library use, so calling timer_create or pthread_kill to deliver a signal would be ok, but installing a signal handler should be avoided (the whole reason behind this vDSO call).
>>>
>>> If you want to abort an enclave, abort it the same way you abort any
>>> other user code -- send a signal or something.  If something is wrong> with signal handling in the proposed vDSO interface, then by all
>>> means, let's fix it.  If we need better library signal support, let's
>>> add such a thing.  If you really want to abort an enclave *cleanly*
>>> without any chance of garbage left around, stick it in a subprocess.
>>> We are not playing the game where someone sets a flag, crosses their
>>> fingers, and waits for an interrupt.
>>
>> Sending a signal is not sufficient. The current __vdso_sgx_enter_enclave call is not interruptible.
>>
> 
> Why not?  If we are failing to deliver signals if
> __vdso_sgx_enter_enclave() is running, we have a bug and we should fix
> it.  We should actually deliver the signal and then either resume the
> enclave or return -EINTR or equivalent.  Are we getting this wrong?
> Looking at your code, I think we're doing it right but maybe we could
> do better.
> 
> I suppose we also need a test case for this if we do anything fancy here.
> 
>>> Now maybe I'm wrong and there's an actual legitimate use case for this
>>> trick, in which case I'd like to be enlightened.  But M:N threading
>>> and enclave aborting don't sound like legitimate use cases.
>>>
>>
>> Why is it ok for this code to return to userspace after 1 second:
>>
>>
>>
>> void ignore_signal_but_dont_restart(int s) {}
>>
>> // set SIGALRM handler if none set (FIXME: racy)
>> struct sigaction sa_old;
>> struct sigaction sa = {
>>     .sa_handler = ignore_signal_but_dont_restart,
>>     .sa_flags = 0,
>> };
>> sigaction(SIGALRM, &sa, &sa_old);
>> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
>>     || (sa_old.sa_flags & SA_SIGINFO)) {
>>     sa_old.sa_flags &= ~SA_RESTART;
>>     sigaction(SIGALRM, &sa_old, NULL);
>> }
>>
>> alarm(1);
>>
>> char buf;
>> read(0, &buf, 1);
>>
> 
> Seems fine.  That's POSIX.  User code is receiving a signal and is
> being affected by the signal.  A signal is a user-visible thing.
> 
>>
>>
>> But, according to your train of thought, this code must hang indefinitely (assuming the enclave is not calling EEXIT)?
>>
>>
>>
>> void ignore_signal_but_dont_restart(int s) {}
>>
>> // set SIGALRM handler if none set (FIXME: racy)
>> struct sigaction sa_old;
>> struct sigaction sa = {
>>     .sa_handler = ignore_signal_but_dont_restart,
>>     .sa_flags = 0,
>> };
>> sigaction(SIGALRM, &sa, &sa_old);
>> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
>>     || (sa_old.sa_flags & SA_SIGINFO)) {
>>     sa_old.sa_flags &= ~SA_RESTART;
>>     sigaction(SIGALRM, &sa_old, NULL);
>> }
>>
>> alarm(1);
>>
>> __vdso_sgx_enter_enclave(0, 0, 0, 2, 0, 0, tcs, NULL, NULL);
>>
> 
> This is a genuine semantic question: is __vdso_sgx_enter_enclave()
> like read on a pipe (returns -EINTR on a signal) or is it more like a
> restartable syscall or a normal library function that just keeps
> running if your signal handler does nothing?  You could siglongjmp()
> out, but that's a bit gross.
> 
> I wouldn't object to an option to __vdso_sgx_enter_enclave() to make
> it return -EINTR if signaled by a non-SA_RESTART signal.  Implementing
> it might be distinctly nontrivial, though.
> 
> But this isn't what this patch does, and I suspect we've been talking
> past each other.  This patch makes __vdso_sgx_enter_enclave() return
> if there's an *interrupt*.  If you push a keyboard button, move your
> mouse, get a network interrupt on that core, etc, it will return.
> This is nonsense.

It's not nontrivial to return on signals, this patch does it. This patch *also* returns when there's a HW interrupt, but that's not important. 

It sounds like you're saying you want to subdivide AEXs into “interrupts that lead to user-observable signals” and “other interrupts”, and then hide the second category from the user. I wouldn't object to that, but I don't know how to code this. It seems like a lot of work compared to the obvious solution (this patch).

--
Jethro Beekman | Fortanix

Attachment:
smime.p7s

Description: S/MIME Cryptographic Signature