Re: [PATCH for_v23 16/16] x86/vdso: sgx: Rework __vdso_sgx_enter_enclave() to prefer "no callback"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/9/2019 12:10 PM, Sean Christopherson wrote:
On Wed, Oct 09, 2019 at 11:00:55AM -0700, Xing, Cedric wrote:
On 10/7/2019 9:46 PM, Sean Christopherson wrote:
-	/* Align stack per x86_64 ABI. The original %rsp is saved in %rbx to be
-	 * restored after the exit handler returns. */
+
+	/* Invoke userspace's exit handler if one was provided. */
+.Lhandle_exit:
+	cmp	$0, 0x20(%rbp)
+	jne	.Linvoke_userspace_handler
+
+.Lout:
+	leave
+	.cfi_def_cfa		%rsp, 8
+	ret
+
+.Linvalid_leaf:

Please set frame pointer back to %rbp here, or stack unwinding will fail.

Sorry, coffee isn't doing it's job, what's getting crushed, and where?

The frame pointer was %rbp but you changed it to %rsp 3 lines ago. That's correct after "leave" and execution won't pass "ret". But the unwinder doesn't know. So you have to restore frame pointer after "ret", by
	.cfi_def_cfa		%rbp, 16

As you mentioned in the stack alignment case, we just can't rely on code review to catch such bugs. We need a test case to make sure all CFI directives are correct, which was also a request from Andy.

+.Lhandle_exception:
+	mov	0x18(%rbp), %rcx
+	test    %rcx, %rcx
+	je	.Lskip_exception_info

A single "jrcxz .Lskip_exception_info" is equivalent to the above 2
instructions combined.

Both implementations take a single uop on CPUs that support SGX.  IMO,
using the simpler and more common instructions is more universally
readable.

I'm not sure the processor could combine 2 instructions ("test"+"je") into just 1 uop. And "jrcxz" is also a broadly used instruction.

+	/* Push @e, u_rsp and @tcs as parameters to the callback. */
  	push	0x18(%rbp)
  	push	%rbx
  	push	0x10(%rbp)
-	/* Call *%rax via retpoline */
-	call	40f
-	/* Restore %rsp to its original value left off by the enclave from last
-	 * exit */
+
+	/* Pass the "return" value to the callback via %rcx. */
+	mov	%eax, %ecx

@e (ex_info) is almost always needed by every callback as it also serves as
the "context pointer". The return value on the other hand is insignificant
because it could be deduced from @e->EX_LEAF anyway. So I'd retain %rcx and
push %rax to the stack instead, given the purpose of this patch is to
squeeze out a bit performance.

Please take this up in patch 02/16, which actually introduced this change.

My apology but willing to pull all related discussions into a single thread.

If you adhere to the convention of "%rcx containing @e", then the code here could be
	push	%rax		// for stack alignment
	push	%rax		// return value
	push	%rbx		// u_rsp
	push	0x10(%rsp)	// tcs
				// %rcx left unchanged pointing to @e
+	/* Clear RFLAGS.DF per x86_64 ABI */
+	cld
+
+	/* Load the callback pointer to %rax and invoke it via retpoline. */
+	mov	0x20(%rbp), %rax

Per X86_64 ABI, %rsp shall be 16 bytes aligned before "call". But %rsp here
doesn't look aligned properly.

Argh, I probably botched it back in patch 02/16 too.  I'll see if I can
add a check to verify %rsp alignment in the selftest, verifying via code
inspection is bound to be error prone.

+	call	.Lretpoline



[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux