On Thu, Aug 20, 2020 at 10:53 AM Jethro Beekman <jethro@xxxxxxxxxxxx> wrote: > > On 2020-08-20 19:44, Andy Lutomirski wrote: > > On Thu, Aug 20, 2020 at 4:20 AM Jethro Beekman <jethro@xxxxxxxxxxxx> wrote: > >> > >> On 2020-08-19 17:02, Andy Lutomirski wrote: > >>> On Wed, Aug 19, 2020 at 7:21 AM Jethro Beekman <jethro@xxxxxxxxxxxx> wrote: > >>>> > >>>> On 2020-08-18 19:15, Andy Lutomirski wrote: > >>>>> On Mon, Aug 17, 2020 at 9:24 PM Sean Christopherson > >>>>> <sean.j.christopherson@xxxxxxxxx> wrote: > >>>>>> > >>>>>> Allow userspace to exit the vDSO on interrupts that are acknowledged > >>>>>> while the enclave is active. This allows the user's runtime to switch > >>>>>> contexts at opportune times without additional overhead, e.g. when using > >>>>>> an M:N threading model (where M user threads run N TCSs, with N > M). > >>>>> > >>>>> This is IMO rather odd. We don't support this type of notification on > >>>>> interrupts for normal user code. The fact user code can detect > >>>>> interrupts during enclave execution is IMO an oddity of SGX, and I > >>>>> have asked Intel to consider replacing the AEX mechanism with > >>>>> something more transparent to user mode. If this ever happens, this > >>>>> mechanism is toast. > >>>> > >>>> Let's design the current interface for the current architecture. We can deal with a new architecture if and when Intel provides it. > >>> > >>> No. If Intel fixes the architecture, the proposed interface will > >>> *stop working*. > >>> > >>>> > >>>>> Even without architecture changes, building a *reliable* M:N threading > >>>>> mechanism on top of this will be difficult or impossible, as there is > >>>>> no particular guarantee that a thread will get timing interrupts at> all or that these interrupts will get lucky and hit enclave code, thus > >>>>> triggering an AEX. We certainly don't, and probably never will, > >>>>> support any corresponding feature for non-enclave code. > >>>> > >>>> There's no guarantee, but this vDSO exit mechanism is a prerequisite. Both for context switching and aborting an enclave, userspace *must* have a way to trigger exit from enclave mode *and* recover the user stack in a sane manner. Userspace *should* also be able to do this in a way that's compatible with library use, so calling timer_create or pthread_kill to deliver a signal would be ok, but installing a signal handler should be avoided (the whole reason behind this vDSO call). > >>> > >>> If you want to abort an enclave, abort it the same way you abort any > >>> other user code -- send a signal or something. If something is wrong> with signal handling in the proposed vDSO interface, then by all > >>> means, let's fix it. If we need better library signal support, let's > >>> add such a thing. If you really want to abort an enclave *cleanly* > >>> without any chance of garbage left around, stick it in a subprocess. > >>> We are not playing the game where someone sets a flag, crosses their > >>> fingers, and waits for an interrupt. > >> > >> Sending a signal is not sufficient. The current __vdso_sgx_enter_enclave call is not interruptible. > >> > > > > Why not? If we are failing to deliver signals if > > __vdso_sgx_enter_enclave() is running, we have a bug and we should fix > > it. We should actually deliver the signal and then either resume the > > enclave or return -EINTR or equivalent. Are we getting this wrong? > > Looking at your code, I think we're doing it right but maybe we could > > do better. > > > > I suppose we also need a test case for this if we do anything fancy here. > > > >>> Now maybe I'm wrong and there's an actual legitimate use case for this > >>> trick, in which case I'd like to be enlightened. But M:N threading > >>> and enclave aborting don't sound like legitimate use cases. > >>> > >> > >> Why is it ok for this code to return to userspace after 1 second: > >> > >> > >> > >> void ignore_signal_but_dont_restart(int s) {} > >> > >> // set SIGALRM handler if none set (FIXME: racy) > >> struct sigaction sa_old; > >> struct sigaction sa = { > >> .sa_handler = ignore_signal_but_dont_restart, > >> .sa_flags = 0, > >> }; > >> sigaction(SIGALRM, &sa, &sa_old); > >> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL) > >> || (sa_old.sa_flags & SA_SIGINFO)) { > >> sa_old.sa_flags &= ~SA_RESTART; > >> sigaction(SIGALRM, &sa_old, NULL); > >> } > >> > >> alarm(1); > >> > >> char buf; > >> read(0, &buf, 1); > >> > > > > Seems fine. That's POSIX. User code is receiving a signal and is > > being affected by the signal. A signal is a user-visible thing. > > > >> > >> > >> But, according to your train of thought, this code must hang indefinitely (assuming the enclave is not calling EEXIT)? > >> > >> > >> > >> void ignore_signal_but_dont_restart(int s) {} > >> > >> // set SIGALRM handler if none set (FIXME: racy) > >> struct sigaction sa_old; > >> struct sigaction sa = { > >> .sa_handler = ignore_signal_but_dont_restart, > >> .sa_flags = 0, > >> }; > >> sigaction(SIGALRM, &sa, &sa_old); > >> if (!(sa_old.sa_handler == SIG_IGN || sa_old.sa_handler == SIG_DFL) > >> || (sa_old.sa_flags & SA_SIGINFO)) { > >> sa_old.sa_flags &= ~SA_RESTART; > >> sigaction(SIGALRM, &sa_old, NULL); > >> } > >> > >> alarm(1); > >> > >> __vdso_sgx_enter_enclave(0, 0, 0, 2, 0, 0, tcs, NULL, NULL); > >> > > > > This is a genuine semantic question: is __vdso_sgx_enter_enclave() > > like read on a pipe (returns -EINTR on a signal) or is it more like a > > restartable syscall or a normal library function that just keeps > > running if your signal handler does nothing? You could siglongjmp() > > out, but that's a bit gross. > > > > I wouldn't object to an option to __vdso_sgx_enter_enclave() to make > > it return -EINTR if signaled by a non-SA_RESTART signal. Implementing > > it might be distinctly nontrivial, though. > > > > But this isn't what this patch does, and I suspect we've been talking > > past each other. This patch makes __vdso_sgx_enter_enclave() return > > if there's an *interrupt*. If you push a keyboard button, move your > > mouse, get a network interrupt on that core, etc, it will return. > > This is nonsense. > > It's not nontrivial to return on signals, this patch does it. This patch *also* returns when there's a HW interrupt, but that's not important. Allow me to quote the changelog of the patch: This allows the user's runtime to switch contexts at opportune times without additional overhead, e.g. when using an M:N threading model (where M user threads run N TCSs, with N > M). NAK. > > It sounds like you're saying you want to subdivide AEXs into “interrupts that lead to user-observable signals” and “other interrupts”, and then hide the second category from the user. I wouldn't object to that, but I don't know how to code this. It seems like a lot of work compared to the obvious solution (this patch). The obvious solution is NAKked. Does siglongjmp() really not work for you? --Andy