On Thu, Apr 25, 2024, Rick P Edgecombe wrote:
> On Thu, 2024-04-25 at 23:09 +0800, Xiaoyao Li wrote:
> > > The idea is that TDX module could add the capability to configure these
> > > bits as well, so that TDs could match normal VMs for cases where there is
> > > a desire for the guest's MAXPA to be smaller than the host's. The
> > > requirements would be, roughly:
> > > - The VMM specifies the 0x80000008.EAX[23:16] when creating a TD.
> > > - The TDX module does sanity checking.
> > > - The 0x80000008.EAX[23:16] field is used to communicate the max
> > >   addressable GPA to the guest. It will be used by the guest firmware to
> > >   make sure resources like PCI BARs are mapped into the addressable GPA.
> > > - If the guest attempts to access memory beyond the max addressable GPA,
> > >   then the TDX module generates EPT violation to the VMM. For the VMM,
> > >   this case means that the guest attempted to access "invalid" (I/O)
> > >   memory.
> > > - The VMM will be expected to terminate the TD guest. The VMM may send
> > >   a notification, but the TDX module doesn't necessarily need to know how.
> >
> > This is not the same as how it works for normal (non-TDX) VMs.
> >
> > For normal VMs, when userspace configures a smaller one than what
> > hardware EPT/NPT supports, it doesn't cause any issue if guest accesses
> > GPA beyond [23:16] but within hardware EPT/NPT capability.
> >
> > It's more a hint to guest that KVM doesn't enforce the semantics of it.
> > However, for TDX case, you are proposing to make it a hard rule.
>
> If we limit ourselves to worrying about valid configurations,

Define "valid configurations".

> accessing a GPA beyond [23:16] is similar to accessing a GPA with no memslot.

No, it's not.  A GPA without a memslot has *very* well-defined semantics in
KVM, and KVM can provide those semantics for all guest-legal GPAs regardless
of hardware EPT/NPT support.

> Like you say, [23:16] is a hint, so there is really no change from KVM's
> perspective. It behaves like normal based on the [7:0] MAXPA.
>
> What do you think should happen in the case a TD accesses a GPA with no
> memslot?

Synthesize a #VE into the guest.  The GPA isn't a violation of the "real"
MAXPHYADDR, so killing the guest isn't warranted.  And that also means the VMM
could legitimately want to put emulated MMIO above the max addressable GPA.
Synthesizing a #VE is also aligned with KVM's non-memslot behavior for TDX
(configured to trigger #VE).

And most importantly, as you note above, the VMM *can't* resolve the problem.
On the other hand, the guest *might* be able to resolve the issue, e.g. it
could request MMIO, which may or may not succeed.  Even if the guest panics,
that's far better than it being terminated by the host as it gives the guest a
chance to capture what led to the panic/crash.

The only downside is that the VMM doesn't have a chance to "bless" the #VE, but
since the VMM literally cannot handle the "bad" access in any way other than
killing the guest, I don't see that as a major problem.

> KVM/QEMU don't have a lot of options to recover. So are the differences here
> just the existing differences between normal VMs and TDX?
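
For readers following the thread, the two CPUID fields under discussion can be
decoded roughly as below.  This is only an illustrative sketch, not KVM or TDX
module code: every helper name is hypothetical, and the fallback from [23:16]
to [7:0] when [23:16] is zero is taken from the architectural definition of
CPUID leaf 0x80000008, not from anything proposed in this thread.  The last
helper identifies the case being debated: a GPA that is legal per the "real"
MAXPA in [7:0] but lies beyond the addressable-GPA hint in [23:16].

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not KVM or TDX module code. */

/* CPUID.0x80000008:EAX[7:0], the physical address width (MAXPA). */
static unsigned int cpuid_maxpa_bits(uint32_t eax)
{
	return eax & 0xff;
}

/* CPUID.0x80000008:EAX[23:16], the guest physical address width field. */
static unsigned int cpuid_guest_maxpa_bits(uint32_t eax)
{
	return (eax >> 16) & 0xff;
}

/* Max addressable GPA width: [23:16] if non-zero, otherwise [7:0]. */
static unsigned int max_addressable_gpa_bits(uint32_t eax)
{
	unsigned int guest_bits = cpuid_guest_maxpa_bits(eax);

	return guest_bits ? guest_bits : cpuid_maxpa_bits(eax);
}

/* True if @gpa has any bit set at or above bit position @bits. */
static int gpa_exceeds_width(uint64_t gpa, unsigned int bits)
{
	/* Shifting a 64-bit value by 64 or more is undefined in C. */
	return bits < 64 && (gpa >> bits) != 0;
}

/*
 * True if @gpa is guest-legal per MAXPA ([7:0]) but beyond the max
 * addressable GPA advertised via [23:16].
 */
static int gpa_legal_but_beyond_hint(uint32_t eax, uint64_t gpa)
{
	return !gpa_exceeds_width(gpa, cpuid_maxpa_bits(eax)) &&
	       gpa_exceeds_width(gpa, max_addressable_gpa_bits(eax));
}
```

With an assumed EAX value encoding MAXPA = 52 and [23:16] = 48, a GPA with bit
50 set falls into the "legal but beyond the hint" range, whereas a GPA with
bit 53 set exceeds MAXPA itself and is a different case entirely.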