+Min, can you comment? 3a3b12cbda ("UefiCpuPkg/MtrrLib: MtrrLibIsMtrrSupported always return FALSE in TD-Guest") turned out to be problematic in practice. Full thread: https://lore.kernel.org/kvm/20250201005048.657470-1-seanjc@xxxxxxxxxx/ On Mon, 2025-02-03 at 16:27 -0800, Sean Christopherson wrote: > On Mon, Feb 03, 2025, Rick P Edgecombe wrote: > > On Mon, 2025-02-03 at 12:33 -0800, Sean Christopherson wrote: > > > > Since there is no upstream KVM TDX support yet, why isn't it an option to > > > > still > > > > revert the EDKII commit too? It was a relatively recent change. > > > > > > I'm fine with that route too, but it too is a band-aid. Relying on the > > > *untrusted* > > > hypervisor to essentially communicate memory maps is not a winning strategy. > > > > > > > To me it seems that the normal KVM MTRR support is not ideal, because it is > > > > still lying about what it is doing. For example, in the past there was an > > > > attempt to use UC to prevent speculative execution accesses to sensitive > > > > data. > > > > The KVM MTRR support only happens to work with existing guests, but not all > > > > possible MTRR usages. > > > > > > > > Since diverging from the architecture creates loose ends like that, we could > > > > instead define some other way for EDKII to communicate the ranges to the > > > > kernel. > > > > Like some simple KVM PV MSRs that are for communication only, and not > > > > > > Hard "no" to any PV solution. This isn't KVM specific, and as above, bouncing > > > through the hypervisor to communicate information within the guest is asinine, > > > especially for CoCo VMs. > > > > Hmm, right. > > > > So the other options could be: > > > > 1. Some TDX module feature to hold the ranges: > > - Con: Not shared with AMD > > > > 2. Re-use MTRRs for the communication, revert changes in guest and edk2: > > Thinking more about how EDK2 is consumed downstream, I think reverting the EDK2 > changes is necessary regardless of what happens in the kernel. Or at the least, > somehow communicate to EDK2 users that ingesting those changes is a bad idea > unless the kernel has also been updated. > > AFAIK, Bring Your Own Firmware[*] isn't widely adopted, which means that the CSP > is shipping the firmware. And shipping OVMF/EDK2 with the "ignores MTRRs" code > will cause problems for guests without commit 8e690b817e38 ("x86/kvm: Override > default caching mode for SEV-SNP and TDX"). Since the host doesn't control the > guest kernel, there's no way to know if deploying those EDK2 changes is safe. > > [*] https://kvm-forum.qemu.org/2024/BYOF_-_KVM_Forum_2024_iWTioIP.pdf > Hmm. Since there is no upstream TDX KVM support, for it's part, I guess KVM should still get a chance to define a cleaner solution (if there actually was a cleaner solution). But yea, it would mean only components from after the solution was settled could be used together for a fully working stack. And it should probably be called out somehow. Maybe could be in the KVM TDX docs or something. Still seems like a thing to avoid if possible. > > - Con: Creating more half support, when it's technically not required > > - Con: Still bouncing through the hypervisor > > I assume by "Re-use MTRRs for the communication" you also mean updating the guest > to address the "everything is UC!" flaw, otherwise another con is: > > - Con: Doesn't address the performance issue with TDX guests "using" UC > memory by default (unless there's yet more enabled). Hmm. This is quite the tangled corner. > > Presumably that can be accomplished by simply skipping the CR0.CD toggling, and > doing MTRR stuff as nonrmal? I'll have to get back to you on this one. Kirill probably could give a better answer, but likely will not be able to follow up on this thread until next week. > > > - Pro: Design and code is clear > > > > 3. Create some new architectural definition, like a bit that means "MTRRs don't > > actually work: > > - Con: Takes a long time, need to get agreement > > - Con: Still bouncing through the hypervisor > > Not for KVM guests. As I laid out in my bug report, it's safe to assume MTRRs > don't actually affect the memory type when running under KVM. > > FWIW, PAT doesn't "work" on most KVM Intel setups either, because of misguided > KVM code that resulted in "Ignore Guest PAT" being set in all EPTEs for the > overwhelming majority of guests. That's not desirable long term because it > prevents the guest from using WC (via PAT) in situations where doing so is needed > for performance and/or correctness. > > > - Pro: More pure solution > > MTRRs "not working" is a red herring. The problem isn't that MTRRs don't work, > it's that the kernel is (somewhat unknowingly) using MTRRs as a crutch to get the > desired memtype for devices. E.g. for emulated MMIO, MTRRs _can't_ be virtualized, > because there's never a valid mapping, i.e. there is no physical memory and thus > no memtype. In other words, under KVM guests (and possibly other hypervisors), > MTRRs end up being nothing more than a communication channel between guest firmware > and the kernel. Yea. > > The gap for CoCo VMs is that using MTRRs is undesirable because they are controlled > by the untrusted host. But that's largely a future problem, unless someone has a > clever way to fix the kernel mess. > > Yea, I wondered about that too. I imagine the thinking was that since it is only controlling shared memory, it can be untrusted. And I guess the solution in this patchset is hypothetically a bit more locked down in that respect.