On Wed, May 12, 2021, Borislav Petkov wrote:
> On Fri, Apr 23, 2021 at 03:58:43PM +0000, Ashish Kalra wrote:
> > +static inline void notify_page_enc_status_changed(unsigned long pfn,
> > +						   int npages, bool enc)
> > +{
> > +	PVOP_VCALL3(mmu.notify_page_enc_status_changed, pfn, npages, enc);
> > +}
>
> Now the question is whether something like that is needed for TDX, and,
> if so, could it be shared by both.

Yes, TDX needs this same hook, but "can't" reuse the hypercall verbatim.
Ditto for SEV-SNP.  I wanted to squish everything into a single common
hypercall, but that didn't pan out.

The problem is that both TDX and SNP define their own versions of this so
that any guest kernel that complies with the TDX|SNP specification will run
cleanly on a hypervisor that also complies with the spec.  This KVM-specific
hook doesn't meet those requirements because non-Linux guest support will be
sketchy at best, and non-KVM hypervisor support will be non-existent.

The best we can do, short of refusing to support TDX or SNP, is to make this
KVM-specific hypercall compatible with TDX and SNP so that the bulk of the
control logic is identical.  The mechanics of actually invoking the
hypercall will differ, but if done right, everything else should be reusable
without modification.

I had an in-depth analysis of this, but it was all off-list.  Pasted below.

TDX uses GPRs to communicate with the host, so it can tunnel "legacy"
hypercalls from time zero.  SNP could technically do the same (with a
revised GHCB spec), but it'd be butt ugly.  And of course trying to go that
route for either TDX or SNP would run into the problem of having to
coordinate the ABI for the "legacy" hypercall across all guests and hosts.

So yeah, trying to remove any of the three (KVM vs. SNP vs. TDX) interfaces
is sadly just wishful thinking.

That being said, I do think we can reuse the KVM-specific hypercall for TDX
and SNP.
Both will still need a {TDX,SNP}-specific GHC{I,B} protocol so that
cross-vendor compatibility is guaranteed, but that shouldn't preclude a
guest that is KVM enlightened from switching to the KVM-specific hypercall
once it can do so.  More thoughts later on.

> I guess a common structure could be used along the lines of what is in the
> GHCB spec today, but that seems like overkill for SEV/SEV-ES, which will
> only ever really do a single page range at a time (through
> set_memory_encrypted() and set_memory_decrypted()). The reason for the
> expanded form for SEV-SNP is that the OS can (proactively) adjust multiple
> page ranges in advance. Will TDX need to do something similar?

Yes, TDX needs the exact same thing.  All three (SEV, SNP, and TDX) have
more or less the exact same hook in the guest (Linux, obviously) kernel.

> If so, the only real common piece in KVM is a function to track what pages
> are shared vs private, which would only require a simple interface.

It's not just KVM, it's also the relevant code in the guest kernel(s) and
other hypervisors.  And the MMU side of KVM will likely be able to share
code, e.g. to act on the page size hint.

> So for SEV/SEV-ES, a simpler hypercall interface to specify a single page
> range is really all that is needed, but it must be common across
> hypervisors. I think that was one of Sean's points originally, we don't
> want one hypercall for KVM, one for Hyper-V, one for VMware, one for Xen,
> etc.

For the KVM-defined interface (required for SEV/SEV-ES), I think it makes
sense to make it a superset of the SNP and TDX protocols so that it _can_
be used in lieu of the SNP/TDX-specific protocol.  I don't know for sure
whether or not that will actually yield better code and/or performance, but
it costs us almost nothing and at least gives us the option of further
optimizing the Linux+KVM combination.
It probably shouldn't be a strict superset, as in practice I don't think the
SNP approach of having individual entries when batching multiple pages will
yield the best performance.  E.g. the vast majority (maybe all?) of
conversions for a Linux guest will be physically contiguous and will have
the same preferred page size, at which point there will be less overhead if
the guest specifies a massive range as opposed to having to sanitize and
fill a large buffer.

TL;DR: I think the KVM hypercall should be something like this, so that it
can be used for SNP and TDX, and possibly for other purposes, e.g. for
paravirt performance enhancements or something.

8. KVM_HC_MAP_GPA_RANGE
-----------------------
:Architecture: x86
:Status: active
:Purpose: Request KVM to map a GPA range with the specified attributes.

a0: the guest physical address of the start page
a1: the number of (4kb) pages (must be contiguous in GPA space)
a2: attributes

where 'attributes' could be something like:

  bits  3:0 - preferred page size encoding 0 = 4kb, 1 = 2mb, 2 = 1gb, etc...
  bit     4 - plaintext = 0, encrypted = 1
  bits 63:5 - reserved (must be zero)