Re: [RFC PATCH 4/4] x86/vdso: x86/sgx: Allow the user to exit the vDSO loop on interrupts

Andy Lutomirski <luto@xxxxxxxxxx> · Sat, 22 Aug 2020 14:55:15 -0700

On Thu, Aug 20, 2020 at 10:53 AM Jethro Beekman <jethro@xxxxxxxxxxxx> wrote:
>
> On 2020-08-20 19:44, Andy Lutomirski wrote:
> > On Thu, Aug 20, 2020 at 4:20 AM Jethro Beekman <jethro@xxxxxxxxxxxx> wrote:
> >>
> >> On 2020-08-19 17:02, Andy Lutomirski wrote:
> >>> On Wed, Aug 19, 2020 at 7:21 AM Jethro Beekman <jethro@xxxxxxxxxxxx> wrote:
> >>>>
> >>>> On 2020-08-18 19:15, Andy Lutomirski wrote:
> >>>>> On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson
> >>>>> <sean.j.christopherson@xxxxxxxxx> wrote:
> >>>>>>
> >>>>>> Allow userspace to exit the vDSO on interrupts that are acknowledged
> >>>>>> while the enclave is active.  This allows the user's runtime to switch
> >>>>>> contexts at opportune times without additional overhead, e.g. when using
> >>>>>> an M:N threading model (where M user threads run N TCSs, with N > M).
> >>>>>
> >>>>> This is IMO rather odd.  We don't support this type of notification on
> >>>>> interrupts for normal user code.  The fact user code can detect
> >>>>> interrupts during enclave execution is IMO an oddity of SGX, and I
> >>>>> have asked Intel to consider replacing the AEX mechanism with
> >>>>> something more transparent to user mode.  If this ever happens, this
> >>>>> mechanism is toast.
> >>>>
> >>>> Let's design the current interface for the current architecture. We can deal with a new architecture if and when Intel provides it.
> >>>
> >>> No.  If Intel fixes the architecture, the proposed interface will
> >>> *stop working*.
> >>>
> >>>>
> >>>>> Even without architecture changes, building a *reliable* M:N threading
> >>>>> mechanism on top of this will be difficult or impossible, as there is
> >>>>> no particular guarantee that a thread will get timing interrupts at> all or that these interrupts will get lucky and hit enclave code, thus
> >>>>> triggering an AEX.  We certainly don't, and probably never will,
> >>>>> support any corresponding feature for non-enclave code.
> >>>>
> >>>> There's no guarantee, but this vDSO exit mechanism is a prerequisite. Both for context switching and aborting an enclave, userspace *must* have a way to trigger exit from enclave mode *and* recover the user stack in a sane manner. Userspace *should* also be able to do this in a way that's compatible with library use, so calling timer_create or pthread_kill to deliver a signal would be ok, but installing a signal handler should be avoided (the whole reason behind this vDSO call).
> >>>
> >>> If you want to abort an enclave, abort it the same way you abort any
> >>> other user code -- send a signal or something.  If something is wrong> with signal handling in the proposed vDSO interface, then by all
> >>> means, let's fix it.  If we need better library signal support, let's
> >>> add such a thing.  If you really want to abort an enclave *cleanly*
> >>> without any chance of garbage left around, stick it in a subprocess.
> >>> We are not playing the game where someone sets a flag, crosses their
> >>> fingers, and waits for an interrupt.
> >>
> >> Sending a signal is not sufficient. The current __vdso_sgx_enter_enclave call is not interruptible.
> >>
> >
> > Why not?  If we are failing to deliver signals if
> > __vdso_sgx_enter_enclave() is running, we have a bug and we should fix
> > it.  We should actually deliver the signal and then either resume the
> > enclave or return -EINTR or equivalent.  Are we getting this wrong?
> > Looking at your code, I think we're doing it right but maybe we could
> > do better.
> >
> > I suppose we also need a test case for this if we do anything fancy here.
> >
> >>> Now maybe I'm wrong and there's an actual legitimate use case for this
> >>> trick, in which case I'd like to be enlightened.  But M:N threading
> >>> and enclave aborting don't sound like legitimate use cases.
> >>>
> >>
> >> Why is it ok for this code to return to userspace after 1 second:
> >>
> >>
> >>
> >> void ignore_signal_but_dont_restart(int s) {}
> >>
> >> // set SIGALRM handler if none set (FIXME: racy)
> >> struct sigaction sa_old;
> >> struct sigaction sa = {
> >>     .sa_handler = ignore_signal_but_dont_restart,
> >>     .sa_flags = 0,
> >> };
> >> sigaction(SIGALRM, &sa, &sa_old);
> >> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
> >>     || (sa_old.sa_flags & SA_SIGINFO)) {
> >>     sa_old.sa_flags &= ~SA_RESTART;
> >>     sigaction(SIGALRM, &sa_old, NULL);
> >> }
> >>
> >> alarm(1);
> >>
> >> char buf;
> >> read(0, &buf, 1);
> >>
> >
> > Seems fine.  That's POSIX.  User code is receiving a signal and is
> > being affected by the signal.  A signal is a user-visible thing.
> >
> >>
> >>
> >> But, according to your train of thought, this code must hang indefinitely (assuming the enclave is not calling EEXIT)?
> >>
> >>
> >>
> >> void ignore_signal_but_dont_restart(int s) {}
> >>
> >> // set SIGALRM handler if none set (FIXME: racy)
> >> struct sigaction sa_old;
> >> struct sigaction sa = {
> >>     .sa_handler = ignore_signal_but_dont_restart,
> >>     .sa_flags = 0,
> >> };
> >> sigaction(SIGALRM, &sa, &sa_old);
> >> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL)
> >>     || (sa_old.sa_flags & SA_SIGINFO)) {
> >>     sa_old.sa_flags &= ~SA_RESTART;
> >>     sigaction(SIGALRM, &sa_old, NULL);
> >> }
> >>
> >> alarm(1);
> >>
> >> __vdso_sgx_enter_enclave(0, 0, 0, 2, 0, 0, tcs, NULL, NULL);
> >>
> >
> > This is a genuine semantic question: is __vdso_sgx_enter_enclave()
> > like read on a pipe (returns -EINTR on a signal) or is it more like a
> > restartable syscall or a normal library function that just keeps
> > running if your signal handler does nothing?  You could siglongjmp()
> > out, but that's a bit gross.
> >
> > I wouldn't object to an option to __vdso_sgx_enter_enclave() to make
> > it return -EINTR if signaled by a non-SA_RESTART signal.  Implementing
> > it might be distinctly nontrivial, though.
> >
> > But this isn't what this patch does, and I suspect we've been talking
> > past each other.  This patch makes __vdso_sgx_enter_enclave() return
> > if there's an *interrupt*.  If you push a keyboard button, move your
> > mouse, get a network interrupt on that core, etc, it will return.
> > This is nonsense.
>
> It's not nontrivial to return on signals, this patch does it. This patch *also* returns when there's a HW interrupt, but that's not important.

Allow me to quote the changelog of the patch:

This allows the user's runtime to switch
contexts at opportune times without additional overhead, e.g. when using
an M:N threading model (where M user threads run N TCSs, with N > M).

NAK.

>
> It sounds like you're saying you want to subdivide AEXs into “interrupts that lead to user-observable signals” and “other interrupts”, and then hide the second category from the user. I wouldn't object to that, but I don't know how to code this. It seems like a lot of work compared to the obvious solution (this patch).

The obvious solution is NAKked.

Does siglongjmp() really not work for you?

--Andy