On Tue, Oct 10, 2017 at 6:15 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
> On 10/10/2017 17:59, Ladi Prosek wrote:
>> On Tue, Oct 10, 2017 at 4:56 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>> On 10/10/2017 14:17, Ladi Prosek wrote:
>>>> +       /* Temporarily set the SMM flag to access the SMM state-save area */
>>>> +       vcpu->arch.hflags |= HF_SMM_MASK;
>>>> +       r = kvm_vcpu_read_guest(vcpu, smbase + 0xfed8, &svm_state_save,
>>>> +                               sizeof(svm_state_save));
>>>> +       vcpu->arch.hflags &= ~HF_SMM_MASK;
>>>> +       if (r)
>>>
>>> Isn't the flag still set here? You have:
>>>
>>> +       if (ctxt->ops->post_leave_smm(ctxt, smbase, &reload_state))
>>> +               return X86EMUL_UNHANDLEABLE;
>>> +
>>> +       if (reload_state) {
>>> +               /*
>>> +                * post_leave_smm() made changes to the vCPU (e.g. entered
>>> +                * guest mode) and is asking us to load the SMM state-save
>>> +                * area again.
>>> +                */
>>> +               ret = rsm_load_state(ctxt, smbase);
>>> +               if (ret != X86EMUL_CONTINUE) {
>>> +                       /* FIXME: should triple fault */
>>> +                       return X86EMUL_UNHANDLEABLE;
>>> +               }
>>> +       }
>>> +
>>>         if ((ctxt->ops->get_hflags(ctxt) & X86EMUL_SMM_INSIDE_NMI_MASK) == 0)
>>>                 ctxt->ops->set_nmi_mask(ctxt, false);
>>>
>>>         ctxt->ops->set_hflags(ctxt, ctxt->ops->get_hflags(ctxt) &
>>>                 ~(X86EMUL_SMM_INSIDE_NMI_MASK | X86EMUL_SMM_MASK));
>>>
>>> On the other hand you need to turn HF_SMM_MASK off and back on
>>> around nested_svm_map/enter_svm_guest_mode, so that the VMCB is
>>> not read from SMRAM.
>>
>> Ah, yes, that's a fallout from the previous arrangement where
>> post_leave_smm really ran after everything else. Sorry for the mix-up.
>>
>>> And the mandatory stupid question: why is the first state load
>>> needed at all? Could VMX's post_leave_smm callback (which would
>>> really become a pre_leave_smm callback) use load_vmcs12_host_state
>>> instead. SVM likewise could extract load_from_hsave_area from
>>> nested_svm_vmexit, and pre-load the SMM state-save area data in
>>> pre_leave_smm.
>>
>> I swear I've tried load_vmcs12_host_state and spent way too much time
>> trying to figure out why it didn't work. I'll sleep on it and try
>> again tomorrow :)
>>
>>> (Yes, load_vmcs12_host_state is overkill, but
>>> it's already there and it already knows how to do things like
>>> entering protected mode etc.).
>>
>> And that I'm not quite sure about. If I remember correctly I had to do
>> a bit of prep work before I could call load_vmcs12_host_state -- is it
>> really supposed to be able to enter protected from real?
>
> Yes, if L2 guest is running in real mode (with unrestricted_guest=1)
> then load_vmcs12_host_state would enter protected mode. It avoids all
> the hoops in emulate.c because it can call vmx_set_cr0/cr4 directly and
> skip all the checks.

Thanks, load_vmcs12_host_state() has a bug. It doesn't write to
GUEST_IDTR_LIMIT and GUEST_GDTR_LIMIT, and the former is causing grief
in our case because it's set to 0 in enter_smm():

        /* Undocumented: IDT limit is set to zero on entry to SMM. */
        dt.address = dt.size = 0;
        kvm_x86_ops->set_idt(vcpu, &dt);

Per Intel SDM 27.5.2 "Loading Host Segment and Descriptor-Table
Registers":

  "The GDTR and IDTR limits are each set to FFFFH."

So this is what the first state load was doing for me, it just restored
the IDTR limit to a good value.
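The fix I have in mind is just setting the limits next to the existing
base writes in load_vmcs12_host_state(), roughly like this (untested
sketch as pasted here, just to show what I mean):

        vmcs_writel(GUEST_IDTR_BASE, vmcs12->host_idtr_base);
        vmcs_writel(GUEST_GDTR_BASE, vmcs12->host_gdtr_base);

        /*
         * SDM 27.5.2: the GDTR and IDTR limits are each set to FFFFH
         * on VM exit.  This also undoes the zero IDT limit that
         * enter_smm() installed.
         */
        vmcs_write32(GUEST_GDTR_LIMIT, 0xFFFF);
        vmcs_write32(GUEST_IDTR_LIMIT, 0xFFFF);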
Now after fixing load_vmcs12_host_state(), it looks like the
return-from-SMM-to-L2 path works without doing anything special before
enter_vmx_non_root_mode (Intel) or enter_svm_guest_mode (AMD). In
particular I don't need to call load_vmcs12_host_state().

To recap, this is what I have in em_rsm() now:

1. Go back to real mode - existing code
2. enter_vmx_non_root_mode() / enter_svm_guest_mode()
3. rsm_load_state_* - existing code

(rough sketch at the bottom of this mail)

Am I just getting lucky or is there a reason why we must have loaded
host state before switching to guest mode? I understand that it would
be cleaner with the intermediate host state load, as going directly
from real mode to guest mode is kind of unnatural. But it works and
saves a bunch of cycles.

Thanks!

> Maybe today's patch "[PATCH] KVM: nVMX: fix guest CR4 loading when
> emulating L2 to L1 exit" can help, actually. SVM has been using
> svm_set_cr4 forever, but until that patch VMX was using kvm_set_cr4
> incorrectly.
>
> This also suggests that placing the "VMXE && SMM" check in kvm_set_cr4
> would bypass some of the issues you're having.
>
> Paolo
>
>>> Instead, post_leave_smm would run with the SMM flag clear, so
>>> hopefully you wouldn't need games with HF_SMM_MASK at all (except
>>> for the reserved VMXE flag---let's leave that out for now and
>>> concentrate on the right way to do L2 state save restore...).
>>
>> Sounds good, thanks!
>>> Paolo
>
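P.S. Here is the rough sketch of the em_rsm() ordering from the recap
above. pre_leave_smm is just a stand-in name and signature for the
renamed post_leave_smm callback, wired to enter_vmx_non_root_mode() /
enter_svm_guest_mode() in the vendor code; the rest is the existing
em_rsm() tail. Untested, only meant to illustrate the flow:

        /* 1. Get back to real mode - existing code up to this point. */

        smbase = ctxt->ops->get_smbase(ctxt);

        /*
         * 2. Switch to guest mode before the SMM state-save area is
         * loaded: enter_vmx_non_root_mode() on Intel,
         * enter_svm_guest_mode() on AMD.
         */
        if (ctxt->ops->pre_leave_smm(ctxt, smbase))
                return X86EMUL_UNHANDLEABLE;

        /* 3. Load the SMM state-save area - existing code. */
        if (emulator_has_longmode(ctxt))
                ret = rsm_load_state_64(ctxt, smbase + 0x8000);
        else
                ret = rsm_load_state_32(ctxt, smbase + 0x8000);
        if (ret != X86EMUL_CONTINUE) {
                /* FIXME: should triple fault */
                return X86EMUL_UNHANDLEABLE;
        }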