On Fri, Apr 19, 2024, kirill.shutemov@xxxxxxxxxxxxxxx wrote: > On Thu, Apr 18, 2024 at 11:26:11AM -0700, Sean Christopherson wrote: > > On Thu, Apr 18, 2024, kirill.shutemov@xxxxxxxxxxxxxxx wrote: > > I think having one trampoline makes sense, e.g. to minimize the probability of > > leaking register state to the VMM. The part that I don't like, and which generates > > awful code, is shoving register state into a memory structure. > > I don't think we can get away with single trampoline. We have outliers. Yeah, I should have said "one trampoline for all of the not insane APIs" :-) > See TDG.VP.VMCALL<ReportFatalError> that uses pretty much all registers as > input. And I hope we wouldn't need TDG.VP.VMCALL<Instruction.PCONFIG> any > time soon. It uses all possible output registers. XMM: just say no. > But I guess we can make a *few* wrappers that covers all needed cases. Yeah. I suspect two will suffice. One for the calls that say at or below four inputs, and one for the fat ones like ReportFatalError that use everything under the sun. For the latter one, marshalling data into registers via an in-memory structure makes, especially if we move away from tdx_module_args, e.g. this is quite clean and reasonable: union { /* Define register order according to the GHCI */ struct { u64 r14, r15, rbx, rdi, rsi, r8, r9, rdx; }; char str[64]; } message; /* VMM assumes '\0' in byte 65, if the message took all 64 bytes */ strtomem_pad(message.str, msg, '\0'); /* * This hypercall should never return and it is not safe * to keep the guest running. Call it forever if it * happens to return. */ while (1) tdx_fat_tdvmcall(&message); > > Weird? Yeah. But at least we one need to document one weird calling convention, > > and the ugliness is contained to three macros and a small assembly function. > > Okay, the approach is worth exploring. I can work on it. > > You focuses here on TDVMCALL. What is your take on the rest of TDCALL? Not sure, haven't looked at them recently. At a glance, something similar? The use of high registers instead of RDI and RSI is damn annoying :-/ Hmm, but it looks like there are enough simple TDCALLs that stay away from high registers that open coding inline asm() is a viable (best?) approach. RAX being the leaf and the return value is annoying, so maybe a simple macro to make it easier to deal with that? It won't allow for perfectly optimal code generation, but forcing a MOV for a TDCALL isn't going to affect performance, and it will make reading the code dead simple. #define tdcall_leaf(leaf) "mov $" leaf ", %%eax\n\t.byte 0x66,0x0f,0x01,0xcc\n\t" Then PAGE_ACCEPT is simply: asm(tdcall_leaf(TDG_MEM_PAGE_ACCEPT) : "=a"(ret), : "c"(start | page_size)); if (ret) return 0; And even the meanies that use R8 are reasonably easy to handle: asm("xor %%r8d, %%r8d\n\t" tdcall_leaf(TDG_MR_REPORT) : "=a"(ret) : "c"(__pa(report)), "d"(__pa(data))); and (though using names for the outputs, I just can't remember the syntax as I'm typing this :-/) asm(tdcall_leaf(TDG_VM_RD) "mov %%r8, %0\n\t" : "=r"(value), "=a"(ret) : "c"(0), "d"(no_idea_what_this_is)); if (ret) <cry> return value; Or, if you wanted to get fancy, use asm_goto_output() to bail on an error so that you could optimize for using RAX as the return value, because *that's going to make all the difference :-D