On Tue, Apr 05, 2022, Brijesh Singh wrote:
> Hi Sean,
>
> On 4/4/22 19:24, Sean Christopherson wrote:
> > > +static void snp_cleanup_vmsa(struct sev_es_save_area *vmsa)
> > > +{
> > > +	int err;
> > > +
> > > +	err = snp_set_vmsa(vmsa, false);
> >
> > Uh, so what happens if a malicious guest does RMPADJUST to convert a VMSA page
> > back to a "normal" page while the host is trying to VMRUN that VMSA?  Does VMRUN
> > fault?
>
> When SEV-SNP is enabled, the VMRUN instruction performs additional
> security checks on various memory pages. In the case of the VMSA page, hardware
> enforces that the page is marked as "VMSA" in the RMP table. If not, VMRUN will
> fail with VMEXIT_INVALID.
>
> After a successful VMRUN, the VMSA page is marked IN_USE by the
> hardware; any attempt to modify its RMP entry will result in a FAIL_INUSE
> error. The IN_USE marking is automatically cleared by the hardware after the
> #VMEXIT.
>
> Please see APM vol 2, section 15.36.12 for additional information.

Thanks!

> > Can Linux refuse to support this madness and instead require the ACPI MP wakeup
> > protocol being proposed/implemented for TDX?  That would allow KVM to have at
>
> My two cents.
>
> In the current architecture, the HV tracks VMSAs by their SPA, and the guest
> controls when they are runnable. This provides flexibility to the guest, which
> can add and remove VMSAs. That flexibility may come in handy to support the
> kexec and reboot use cases.

I understand it provides the guest flexibility, but IMO it completely inverts
the separation of concerns between host and guest. The host should have control
of when a vCPU is added/removed and with what state, and the guest should be
able to verify/acknowledge any changes. This scheme gives the guest the bulk of
the control, and doesn't even let the host verify much at all since the VMSA is
opaque. That the guest can yank the rug out from under the host at any time
just adds to the pain.
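To make the failure modes concrete, here's a toy C model of the RMP checks described above. This is purely illustrative, not kernel code: the struct layout and function names are invented, while the state and error names (VMSA, IN_USE, VMEXIT_INVALID, FAIL_INUSE) follow the APM vol 2 description.

```c
#include <stdbool.h>

/*
 * Toy model of the SNP RMP checks around VMRUN -- all names other than
 * the APM error codes are hypothetical.
 */
enum rmp_page_state { RMP_HYPERVISOR, RMP_GUEST_PRIVATE, RMP_VMSA };

struct rmp_entry {
	enum rmp_page_state state;
	bool in_use;		/* set by hardware for the duration of VMRUN */
};

#define VMRUN_OK	0
#define VMEXIT_INVALID	-1	/* VMRUN refuses a page not marked VMSA */
#define FAIL_INUSE	-2	/* RMPADJUST refuses an IN_USE page */

/* VMRUN: hardware enforces that the target page is marked VMSA. */
static int model_vmrun(struct rmp_entry *e)
{
	if (e->state != RMP_VMSA)
		return VMEXIT_INVALID;
	e->in_use = true;	/* cleared again by hardware at #VMEXIT */
	return VMRUN_OK;
}

/* RMPADJUST: guest attempts to change the page's RMP state. */
static int model_rmpadjust(struct rmp_entry *e, enum rmp_page_state new_state)
{
	if (e->in_use)
		return FAIL_INUSE;
	e->state = new_state;
	return 0;
}
```

In this model, the race in the question resolves exactly as described: if the guest's RMPADJUST lands before VMRUN, VMRUN fails with VMEXIT_INVALID; if VMRUN wins, the page is IN_USE and the guest's RMPADJUST fails with FAIL_INUSE until #VMEXIT.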
VMEXIT_INVALID isn't the end of the world, but it breaks the assumption that
such errors are host bugs. To guard against such behavior, the host would have
to unmap the VMSA page in order to prevent unwanted RMPADJUST, and that gets
ugly fast if a VMSA can be any arbitrary guest page.

Another example is the 2mb alignment erratum. Technically, the guest can't work
around the erratum with 100% certainty because there's no guarantee that the
host uses the same alignment for gfns and pfns. I don't actually expect a host
to use unaligned mappings, just pointing out how backwards this is.

I fully realize there's basically zero chance of getting any of this changed in
hardware/firmware, but I'm hoping we can concoct a software/GHCB solution to
the worst issues. I don't see an easy way to address the guest getting to shove
state directly into the VMSA, but the location of the VMSA gfn/pfn is a very
solvable problem.

E.g. the host gets full control over each vCPU's VMSA, and the host-provided
VMSA is discoverable in the guest. That allows the guest to change vCPU state,
e.g. for AP bringup, kexec, etc..., but gives the host the ability to protect
itself without having to support arbitrary VMSA pages. E.g. the host can
dynamically map/unmap the VMSA from the guest: map on fault, unmap on AP
"creation", refuse to run the vCPU if its VMSA isn't in the unmapped state.
The VMSA pfn is fully host controlled, so there's no need for the guest to be
aware of the 2mb alignment erratum.

Requiring such GHCB extensions in the guest would make Linux incompatible with
hypervisors that aren't updated, but IMO that's not a ridiculous ask given that
it would be in the best interests of any hypervisor that isn't running a fully
trusted, paravirt VMPL0.

> The current approach does not depend on
> ACPI; it will also come in handy to support microvm (a minimalist machine type
> without PCI nor ACPI support).
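The map-on-fault/unmap-on-creation policy sketched above could look roughly like the following. Again a toy model, not a patch: every name here is invented, and the real thing would live in KVM's fault and GHCB-request paths.

```c
#include <stdbool.h>

/*
 * Rough model of the proposed host-side VMSA policy -- hypothetical
 * names throughout.  The host owns one VMSA pfn per vCPU and only
 * enters the guest while the guest's mapping of that VMSA is torn
 * down, so the guest can never RMPADJUST it out from under VMRUN.
 */
enum vmsa_map_state { VMSA_UNMAPPED, VMSA_MAPPED };

struct vcpu_vmsa {
	enum vmsa_map_state map_state;
};

/* Guest faults on the VMSA gpa, e.g. to fill in AP state: map it in. */
static void vmsa_handle_fault(struct vcpu_vmsa *v)
{
	v->map_state = VMSA_MAPPED;
}

/* Guest requests AP "creation": host unmaps the VMSA from the guest. */
static void vmsa_ap_create(struct vcpu_vmsa *v)
{
	v->map_state = VMSA_UNMAPPED;
}

/* Refuse to run the vCPU while the VMSA is guest-mappable. */
static bool vmsa_can_vmrun(const struct vcpu_vmsa *v)
{
	return v->map_state == VMSA_UNMAPPED;
}
```

The point of the model is the invariant, not the helpers: VMRUN and a guest-writable VMSA mapping are mutually exclusive, which is what lets the host pick the pfn (and its alignment) unilaterally.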
Eh, a microvm really shouldn't need AP bringup in the first place; just run all
APs from time zero and route them to where they need to be.