On Thu, 2023-11-02 at 16:22 -0700, Sean Christopherson wrote: > On Thu, Nov 02, 2023, Maxim Levitsky wrote: > > On Tue, 2023-10-10 at 20:02 +0000, John Allen wrote: > > > @@ -3032,6 +3037,9 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm) > > > if (guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDTSCP)) > > > svm_clr_intercept(svm, INTERCEPT_RDTSCP); > > > } > > > + > > > + if (kvm_caps.supported_xss) > > > + set_msr_interception(vcpu, svm->msrpm, MSR_IA32_XSS, 1, 1); > > > > This is not just a virtualization hole. This allows the guest to set MSR_IA32_XSS > > to whatever value it wants, and thus it might allow XSAVES to access some host msrs > > that guest must not be able to access. > > > > AMD might not yet have such msrs, but on Intel side I do see various components > > like 'HDC State', 'HWP state' and such. > > The approach AMD has taken with SEV-ES+ is to have ucode context switch everything > that the guest can access. So, in theory, if/when AMD adds more XCR0/XSS-based > features, that state will also be context switched. > > Don't get me wrong, I hate this with a passion, but it's not *quite* fatally unsafe, > just horrific. > > > I understand that this is needed so that #VC handler could read this msr, and > > trying to read it will cause another #VC which is probably not allowed (I > > don't know this detail of SEV-ES) > > > > I guess #VC handler should instead use a kernel cached value of this msr > > instead, or at least KVM should only allow reads and not writes to it. > > Nope, doesn't work. In addition to automatically context switching state, SEV-ES > also encrypts the guest state, i.e. KVM *can't* correctly virtualize XSS (or XCR0) > for the guest, because KVM *can't* load the guest's desired value into hardware. > > The guest can do #VMGEXIT (a.k.a. VMMCALL) all it wants to request a certain XSS > or XCR0, and there's not a damn thing KVM can do to service the request. > Ah, I understand now. Everything makes sense, and yes, this is really ugly. Best regards, Maxim Levitsky