Hello Sean,

On Wed, May 12, 2021 at 03:51:10PM +0000, Sean Christopherson wrote:
> On Wed, May 12, 2021, Borislav Petkov wrote:
> > On Fri, Apr 23, 2021 at 03:58:43PM +0000, Ashish Kalra wrote:
> > > +static inline void notify_page_enc_status_changed(unsigned long pfn,
> > > +						     int npages, bool enc)
> > > +{
> > > +	PVOP_VCALL3(mmu.notify_page_enc_status_changed, pfn, npages, enc);
> > > +}
> >
> > Now the question is whether something like that is needed for TDX, and,
> > if so, could it be shared by both.
>
> Yes, TDX needs this same hook, but "can't" reuse the hypercall verbatim.
> Ditto for SEV-SNP.  I wanted to squish everything into a single common
> hypercall, but that didn't pan out.
>
> The problem is that both TDX and SNP define their own versions of this so
> that any guest kernel that complies with the TDX|SNP specification will run
> cleanly on a hypervisor that also complies with the spec.  This KVM-specific
> hook doesn't meet those requirements because non-Linux guest support will be
> sketchy at best, and non-KVM hypervisor support will be non-existent.
>
> The best we can do, short of refusing to support TDX or SNP, is to make this
> KVM-specific hypercall compatible with TDX and SNP so that the bulk of the
> control logic is identical.  The mechanics of actually invoking the
> hypercall will differ, but if done right, everything else should be reusable
> without modification.
>
> I had an in-depth analysis of this, but it was all off-list.  Pasted below.
>
> TDX uses GPRs to communicate with the host, so it can tunnel "legacy"
> hypercalls from time zero.  SNP could technically do the same (with a
> revised GHCB spec), but it'd be butt ugly.  And of course trying to go that
> route for either TDX or SNP would run into the problem of having to
> coordinate the ABI for the "legacy" hypercall across all guests and hosts.
> So yeah, trying to remove any of the three (KVM vs. SNP vs. TDX) interfaces
> is sadly just wishful thinking.
>
> That being said, I do think we can reuse the KVM-specific hypercall for TDX
> and SNP.  Both will still need a {TDX,SNP}-specific GHC{I,B} protocol so
> that cross-vendor compatibility is guaranteed, but that shouldn't preclude a
> guest that is KVM-enlightened from switching to the KVM-specific hypercall
> once it can do so.  More thoughts later on.
>
> > I guess a common structure could be used along the lines of what is in
> > the GHCB spec today, but that seems like overkill for SEV/SEV-ES, which
> > will only ever really do a single page range at a time (through
> > set_memory_encrypted() and set_memory_decrypted()).  The reason for the
> > expanded form for SEV-SNP is that the OS can (proactively) adjust
> > multiple page ranges in advance.  Will TDX need to do something similar?
>
> Yes, TDX needs the exact same thing.  All three (SEV, SNP, and TDX) have
> more or less the exact same hook in the guest (Linux, obviously) kernel.
>
> > If so, the only real common piece in KVM is a function to track what
> > pages are shared vs private, which would only require a simple interface.
>
> It's not just KVM, it's also the relevant code in the guest kernel(s) and
> other hypervisors.  And the MMU side of KVM will likely be able to share
> code, e.g. to act on the page size hint.
>
> > So for SEV/SEV-ES, a simpler hypercall interface to specify a single page
> > range is really all that is needed, but it must be common across
> > hypervisors.  I think that was one of Sean's points originally, we don't
> > want one hypercall for KVM, one for Hyper-V, one for VMware, one for Xen,
> > etc.
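Right, and the notify_page_enc_status_changed() hook quoted at the top is
exactly that guest-side seam: generic code only ever calls the pv_ops hook,
and each hypervisor wires in its own implementation.  Roughly, for the KVM
guest (helper name illustrative, not final):

	/* In the KVM guest init path, e.g. kvm_guest_init(). */
	if (sev_active())
		pv_ops.mmu.notify_page_enc_status_changed =
						kvm_sev_hc_page_enc_status;

where kvm_sev_hc_page_enc_status() would then issue the actual hypercall per
the interface defined below.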
> For the KVM-defined interface (required for SEV/SEV-ES), I think it makes
> sense to make it a superset of the SNP and TDX protocols so that it _can_
> be used in lieu of the SNP/TDX-specific protocol.  I don't know for sure
> whether or not that will actually yield better code and/or performance, but
> it costs us almost nothing and at least gives us the option of further
> optimizing the Linux+KVM combination.
>
> It probably shouldn't be a strict superset, as in practice I don't think
> the SNP approach of having individual entries when batching multiple pages
> will yield the best performance.  E.g. the vast majority (maybe all?) of
> conversions for a Linux guest will be physically contiguous and will have
> the same preferred page size, at which point there will be less overhead if
> the guest specifies a massive range as opposed to having to sanitize and
> fill a large buffer.
>
> TL;DR: I think the KVM hypercall should be something like this, so that it
> can be used for SNP and TDX, and possibly for other purposes, e.g. for
> paravirt performance enhancements or something.
>
> 8. KVM_HC_MAP_GPA_RANGE
> -----------------------
> :Architecture: x86
> :Status: active
> :Purpose: Request KVM to map a GPA range with the specified attributes.
>
> a0: the guest physical address of the start page
> a1: the number of (4kb) pages (must be contiguous in GPA space)
> a2: attributes
>
> where 'attributes' could be something like:
>
> bits 3:0 - preferred page size encoding 0 = 4kb, 1 = 2mb, 2 = 1gb, etc...
> bit 4 - plaintext = 0, encrypted = 1
> bits 63:5 - reserved (must be zero)

Ok.  Will modify the page encryption status hypercall to be compatible with
the above defined interface.

Thanks,
Ashish
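P.S.: For concreteness, something like the below is what I have in mind for
the guest side.  kvm_hypercall3() is the existing guest helper from
asm/kvm_para.h; the KVM_HC_MAP_GPA_RANGE number and the attribute macros are
placeholders following the layout proposed above:

	#define KVM_MAP_GPA_RANGE_PAGE_SZ_4K	0	/* bits 3:0 */
	#define KVM_MAP_GPA_RANGE_ENCRYPTED	BIT(4)	/* bit 4 */

	static void kvm_sev_hc_page_enc_status(unsigned long pfn, int npages,
					       bool enc)
	{
		unsigned long attrs = KVM_MAP_GPA_RANGE_PAGE_SZ_4K;

		if (enc)
			attrs |= KVM_MAP_GPA_RANGE_ENCRYPTED;

		/* a0 = GPA of first page, a1 = number of 4kb pages, a2 = attrs */
		kvm_hypercall3(KVM_HC_MAP_GPA_RANGE, pfn << PAGE_SHIFT,
			       npages, attrs);
	}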