On Wed, Apr 03, 2024 at 08:58:50AM -0700, Sean Christopherson <seanjc@xxxxxxxxxx> wrote: > On Mon, Feb 26, 2024, isaku.yamahata@xxxxxxxxx wrote: > > From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx> > > > > Some of TDG.VP.VMCALL require device model, for example, qemu, to handle > > them on behalf of kvm kernel module. TDVMCALL_REPORT_FATAL_ERROR, > > TDVMCALL_MAP_GPA, TDVMCALL_SETUP_EVENT_NOTIFY_INTERRUPT, and > > TDVMCALL_GET_QUOTE requires user space VMM handling. > > > > Introduce new kvm exit, KVM_EXIT_TDX, and functions to setup it. Device > > model should update R10 if necessary as return value. > > Hard NAK. > > KVM needs its own ABI, under no circumstance should KVM inherit ABI directly from > the GHCI. Even worse, this doesn't even sanity check the "unknown" VMCALLs, KVM > just blindly punts *everything* to userspace. And even worse than that, KVM > already has at least one user exit that overlaps, TDVMCALL_MAP_GPA => KVM_HC_MAP_GPA_RANGE. > > If the userspace VMM wants to run an end-around on KVM and directly communicate > with the guest, e.g. via a synthetic device (a la virtio), that's totally fine, > because *KVM* is not definining any unique ABI, KVM is purely providing the > transport, e.g. emulated MMIO or PIO (and maybe not even that). IIRC, this option > even came up in the context of GET_QUOTE. > > But explicit exiting to userspace with KVM_EXIT_TDX is very different. KVM is > creating a contract with userspace that says "for TDX VMCALLs [a-z], KVM will exit > to userspace with values [a-z]". *Every* new VMCALL that's added to the GHCI will > become KVM ABI, e.g. if Intel ships a TDX module that adds a new VMALL, then KVM > will forward the exit to userspace, and userspace can then start relying on that > behavior. > > And punting all register state, decoding, etc. to userspace creates a crap ABI. > KVM effectively did this for SEV and SEV-ES by copying the PSP ABI verbatim into > KVM ioctls(), and it's a gross, ugly mess. > > Each VMCALL that KVM wants to forward needs a dedicated KVM_EXIT_<reason> and > associated struct in the exit union. Yes, it's slightly more work now, but it's > one time pain. Whereas copying all registers is endless misery for everyone > involved, e.g. *every* userspace VMM needs to decipher the registers, do sanity > checking, etc. And *every* end user needs to do the same when a debugging > inevitable failures. > > This also solves Chao's comment about XMM registers. Except for emualting Hyper-V > hypercalls, which have very explicit handling, KVM does NOT support using XMM > registers in hypercalls. Sure. I will introduce the followings. KVM_EXIT_TDX_GET_QUOTE Request a quote. KVM_EXIT_TDX_SETUP_EVENT_NOTIFY_INTERRUPT Guest tells which interrupt vector the VMM uses to notify the guest. The use case if GetQuote. It is async request. The user-space VMM uses this interrupts to notify the guest on the completion. Or guest polls it. KVM_EXIT_TDX_REPORT_FATAL_ERROR Guest panicked. This conveys extra 64 bytes in registers. Probably this should be converted to KVM_EXIT_SYSTEM_EVENT and KVM_SYSTEM_EVENT_CRASH. MapGPA is converted to KVM_HC_MAP_GPA_RANGE. -- Isaku Yamahata <isaku.yamahata@xxxxxxxxx>