Re: [PATCH v28 21/22] x86/vdso: Implement a vDSO for Intel SGX enclave call

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 16, 2020 at 11:38:24PM +0200, Jarkko Sakkinen wrote:
> On Mon, 2020-03-16 at 10:01 -0400, Nathaniel McCallum wrote:
> > On Mon, Mar 16, 2020 at 9:56 AM Jarkko Sakkinen
> > <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
> > > On Sun, 2020-03-15 at 13:53 -0400, Nathaniel McCallum wrote:
> > > > On Sat, Mar 14, 2020 at 9:25 PM Jarkko Sakkinen
> > > > <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
> > > > > On Wed, Mar 11, 2020 at 01:30:07PM -0400, Nathaniel McCallum wrote:
> > > > > > Currently, the selftest has a wrapper around
> > > > > > __vdso_sgx_enter_enclave() which preserves all x86-64 ABI callee-saved
> > > > > > registers (CSRs), though it uses none of them. Then it calls this
> > > > > > function which uses %rbx but preserves none of the CSRs. Then it jumps
> > > > > > into an enclave which zeroes all these registers before returning.
> > > > > > Thus:
> > > > > > 
> > > > > >   1. wrapper saves all CSRs
> > > > > >   2. wrapper repositions stack arguments
> > > > > >   3. __vdso_sgx_enter_enclave() modifies, but does not save %rbx
> > > > > >   4. selftest zeros all CSRs
> > > > > >   5. wrapper loads all CSRs
> > > > > > 
> > > > > > I'd like to propose instead that the enclave be responsible for saving
> > > > > > and restoring CSRs. So instead of the above we have:
> > > > > >   1. __vdso_sgx_enter_enclave() saves %rbx
> > > > > >   2. enclave saves CSRs
> > > > > >   3. enclave loads CSRs
> > > > > >   4. __vdso_sgx_enter_enclave() loads %rbx
> > > > > > 
> > > > > > I know that lots of other stuff happens during enclave transitions,
> > > > > > but at the very least we could reduce the number of instructions
> > > > > > through this critical path.
> > > > > 
> > > > > What Jethro said and also that it is a good general principle to cut
> > > > > down the semantics of any vdso as minimal as possible.
> > > > > 
> > > > > I.e. even if saving RBX would make somehow sense it *can* be left
> > > > > out without loss in terms of what can be done with the vDSO.
> > > > 
> > > > Please read the rest of the thread. Sean and I have hammered out some
> > > > sensible and effective changes.
> > > 
> > > Have skimmed through that discussion but it comes down how much you get
> > > by obviously degrading some of the robustness. Complexity of the calling
> > > pattern is not something that should be emphasized as that is something
> > > that is anyway hidden inside the runtime.
> > 
> > My suggestions explicitly maintained robustness, and in fact increased
> > it. If you think we've lost capability, please speak with specificity
> > rather than in vague generalities. Under my suggestions we can:
> > 1. call the vDSO from C
> > 2. pass context to the handler
> > 3. have additional stack manipulation options in the handler
> > 
> > The cost for this is a net 2 additional instructions. No existing
> > capability is lost.
> 
> My vague generality in this case is just that the whole design
> approach so far has been to minimize the amount of wrapping to
> EENTER.

Yes and no.   If we wanted to minimize the amount of wrapping around the
vDSO's ENCLU then we wouldn't have the exit handler shenanigans in the
first place.  The whole process has been about balancing the wants of each
use case against the overall quality of the API and code.

> And since this has been kind of agreed by most of the
> stakeholders doing something against the chosen strategy is
> something I do hold some resistance.

Up until Nathaniel joined the party, the only stakeholder in terms of the
exit handler was the Intel SDK.  There was a general consensus to pass
registers as-is when there isn't a strong reason to do otherwise.  Note
that Nathaniel has also expressed approval of that approach.

So I think the question that needs to be answered is whether the benefits
of using %rcx instead of %rax to pass @leaf justify the "pass registers
as-is" guideline.  We've effectively already given this waiver for %rbx,
as the whole reason why the TCS is passed in on the stack instead of via
%rbx is so that it can be passed to the exit handler.  E.g. the vDSO
could take the TCS in %rbx and save it on the stack, but we're throwing
the baby out with the bathwater at that point.

The major benefits being that the vDSO would be callable from C and that
the kernel could define a legitimate prototype instead of a frankenstein
prototype that's half assembly and half C.  For me, those are significant
benefits and well worth the extra MOV, PUSH and POP.  For some use cases
it would eliminate the need for an assembly wrapper.  For runtimes that
need an assembly wrapper for whatever reason, it's probably still a win as
a well designed runtime can avoid register shuffling in the wrapper.  And
if there is a runtime that isn't covered by the above, it's at worst an
extra MOV.



[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux