On Fri, Mar 13, 2020 at 12:46 PM Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote: > > On Fri, Mar 13, 2020 at 11:48:54AM -0400, Nathaniel McCallum wrote: > > Thinking about this more carefully, I still think that at least part > > of my critique still stands. > > > > __vdso_sgx_enter_enclave() doesn't use the x86-64 ABI. This means that > > there will always be an assembly wrapper for > > __vdso_sgx_enter_enclave(). But because __vdso_sgx_enter_enclave() > > doesn't save %rbx, the wrapper is forced to in order to be called from > > C. > > > > A common pattern for the wrapper will be to do something like this: > > > > # void enter_enclave(rdi, rsi, rdx, unused, r8, r9, @tcs, @e, > > @handler, @leaf, @vdso) > > enter_enclave: > > push %rbx > > push $0 /* align */ > > push 0x48(%rsp) > > push 0x48(%rsp) > > push 0x48(%rsp) > > > > mov 0x70(%rsp), %eax > > call *0x68(%rsp) > > > > add $0x20, %rsp > > pop %rbx > > ret > > > > Because __vdso_sgx_enter_enclave() doesn't preserve %rbx, the wrapper > > is forced to reposition stack parameters in a performance-critical > > path. On the other hand, if __vdso_sgx_enter_enclave() preserved %rbx, > > you could implement the above as: > > > > # void enter_enclave(rdi, rsi, rdx, unused, r8, r9, @tcs, @e, > > @handler, @leaf, @vdso) > > enter_enclave: > > mov 0x20(%rsp), %eax > > jmp *0x28(%rsp) > > > > This also implies that if __vdso_sgx_enter_enclave() took @leaf as a > > stack parameter and preserved %rbx, it would be x86-64 ABI compliant > > enough to call from C if the enclave preserves all callee-saved > > registers besides %rbx (Enarx does). > > > > What are the downsides of this approach? It also doesn't harm the more > > extended case when you need to use an assembly wrapper to setup > > additional registers. This can still be done. It does imply an extra > > push and mov instruction. But because there are currently an odd > > number of stack function parameters, the push also removes an > > alignment instruction where the stack is aligned before the call to > > __vdso_sgx_enter_enclave() (likely). Further, the push and mov are > > going to be performed by *someone* in order to call > > __vdso_sgx_enter_enclave() from C. > > > > Therefore, I'd like to propose that __vdso_sgx_enter_enclave(): > > * Preserve %rbx. > > At first glance, that looks sane. Being able to call __vdso... from C > would certainly be nice. Agreed. I think ergonomically we want __vdso...() to be called from C and the handler to be implemented in asm (optionally); without breaking the ability to call __vdso..() from asm in special cases. I think all ergonomic issues get solved by the following: * Pass a void * into the handler from C through __vdso...(). * Allow the handler to pop parameters off of the output stack without hacks. This allows the handler to pop extra arguments off the stack and write them into the memory at the void *. Then the handler can be very small and pass logic back to the caller of __vdso...(). Here's what this all means for the enclave. For maximum usability, the enclave should preserve all callee-saved registers (except %rbx, which is preserved by __vdso..()). For each ABI rule that the enclave breaks, you need logic in a handler to fix it. So if you push return params on the stack, the handler needs to undo that. This doesn't compromise the ability to treat __vsdo...() like ENCLU if you need the full power. But it does make it significantly easier to consume when you don't have special needs. So as I see it, __vdso...() should: 1. preserve %rbx 2. take leaf in %rcx 3. gain a void* stack param which is passed to the handler 4. sub/add to %rsp rather than save/restore That would make this a very usable and fast interface without sacrificing any of its current power. > > * Take the leaf as an additional stack parameter instead of passing > > it in %rax. > > Does the leaf even need to be a stack param? Wouldn't it be possible to > use %rcx as @leaf instead of @unusued? E.g. Even better! > int __vdso_sgx_enter_enclave(unsigned long rdi, unsigned long rsi, > unsigned long rdx, unsigned int leaf, > unsigned long r8, unsigned long r9, > void *tcs, struct sgx_enclave_exception *e, > sgx_enclave_exit_handler_t handler) > { > push %rbp > mov %rsp, %rbp > push %rbx > > mov %ecx, %eax > .Lenter_enclave > cmp $0x2, %eax > jb .Linvalid_leaf > cmp $0x3, %eax > ja .Linvalid_leaf > > mov 0x0x10(%rbp), %rbx > lea .Lasync_exit_pointer(%rip), %rcx > > .Lasync_exit_pointer: > .Lenclu_eenter_eresume: > enclu > > xor %eax, %eax > > .Lhandle_exit: > cmp $0, 0x20(%rbp) > jne .Linvoke_userspace_handler > > .Lout > pop %rbx > leave > ret > } > > > > Then C can call it without additional overhead. And people who need to > > "extend" the C ABI can still do so. > > >