On 10/9/2019 12:10 PM, Sean Christopherson wrote:
On Wed, Oct 09, 2019 at 11:00:55AM -0700, Xing, Cedric wrote:
On 10/7/2019 9:46 PM, Sean Christopherson wrote:
- /* Align stack per x86_64 ABI. The original %rsp is saved in %rbx to be
- * restored after the exit handler returns. */
+
+ /* Invoke userspace's exit handler if one was provided. */
+.Lhandle_exit:
+ cmp $0, 0x20(%rbp)
+ jne .Linvoke_userspace_handler
+
+.Lout:
+ leave
+ .cfi_def_cfa %rsp, 8
+ ret
+
+.Linvalid_leaf:
Please set frame pointer back to %rbp here, or stack unwinding will fail.
Sorry, coffee isn't doing it's job, what's getting crushed, and where?
The frame pointer was %rbp but you changed it to %rsp 3 lines ago.
That's correct after "leave" and execution won't pass "ret". But the
unwinder doesn't know. So you have to restore frame pointer after "ret", by
.cfi_def_cfa %rbp, 16
As you mentioned in the stack alignment case, we just can't rely on code
review to catch such bugs. We need a test case to make sure all CFI
directives are correct, which was also a request from Andy.
+.Lhandle_exception:
+ mov 0x18(%rbp), %rcx
+ test %rcx, %rcx
+ je .Lskip_exception_info
A single "jrcxz .Lskip_exception_info" is equivalent to the above 2
instructions combined.
Both implementations take a single uop on CPUs that support SGX. IMO,
using the simpler and more common instructions is more universally
readable.
I'm not sure the processor could combine 2 instructions ("test"+"je")
into just 1 uop. And "jrcxz" is also a broadly used instruction.
+ /* Push @e, u_rsp and @tcs as parameters to the callback. */
push 0x18(%rbp)
push %rbx
push 0x10(%rbp)
- /* Call *%rax via retpoline */
- call 40f
- /* Restore %rsp to its original value left off by the enclave from last
- * exit */
+
+ /* Pass the "return" value to the callback via %rcx. */
+ mov %eax, %ecx
@e (ex_info) is almost always needed by every callback as it also serves as
the "context pointer". The return value on the other hand is insignificant
because it could be deduced from @e->EX_LEAF anyway. So I'd retain %rcx and
push %rax to the stack instead, given the purpose of this patch is to
squeeze out a bit performance.
Please take this up in patch 02/16, which actually introduced this change.
My apology but willing to pull all related discussions into a single thread.
If you adhere to the convention of "%rcx containing @e", then the code
here could be
push %rax // for stack alignment
push %rax // return value
push %rbx // u_rsp
push 0x10(%rsp) // tcs
// %rcx left unchanged pointing to @e
+ /* Clear RFLAGS.DF per x86_64 ABI */
+ cld
+
+ /* Load the callback pointer to %rax and invoke it via retpoline. */
+ mov 0x20(%rbp), %rax
Per X86_64 ABI, %rsp shall be 16 bytes aligned before "call". But %rsp here
doesn't look aligned properly.
Argh, I probably botched it back in patch 02/16 too. I'll see if I can
add a check to verify %rsp alignment in the selftest, verifying via code
inspection is bound to be error prone.
+ call .Lretpoline