> From: Nadav Har'El
> Sent: Tuesday, May 17, 2011 3:54 AM
>
> This patch implements nested_vmx_vmexit(), called when the nested L2 guest
> exits and we want to run its L1 parent and let it handle this exit.
>
> Note that this will not necessarily be called on every L2 exit. L0 may decide
> to handle a particular exit on its own, without L1's involvement; In that
> case, L0 will handle the exit, and resume running L2, without running L1 and
> without calling nested_vmx_vmexit(). The logic for deciding whether to handle
> a particular exit in L1 or in L0, i.e., whether to call nested_vmx_vmexit(),
> will appear in a separate patch below.
>
> Signed-off-by: Nadav Har'El <nyh@xxxxxxxxxx>

> +/*
> + * A part of what we need to when the nested L2 guest exits and we want to
> + * run its L1 parent, is to reset L1's guest state to the host state specified
> + * in vmcs12.
> + * This function is to be called not only on normal nested exit, but also on
> + * a nested entry failure, as explained in Intel's spec, 3B.23.7 ("VM-Entry
> + * Failures During or After Loading Guest State").
> + * This function should be called when the active VMCS is L1's (vmcs01).
> + */
> +void load_vmcs12_host_state(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> +{
> +	if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_EFER)
> +		vcpu->arch.efer = vmcs12->host_ia32_efer;
> +	if (vmcs12->vm_exit_controls & VM_EXIT_HOST_ADDR_SPACE_SIZE)
> +		vcpu->arch.efer |= (EFER_LMA | EFER_LME);
> +	else
> +		vcpu->arch.efer &= ~(EFER_LMA | EFER_LME);
> +	vmx_set_efer(vcpu, vcpu->arch.efer);
> +
> +	if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PAT)
> +		vmcs_write64(GUEST_IA32_PAT, vmcs12->host_ia32_pat);
> +
> +	kvm_register_write(vcpu, VCPU_REGS_RSP, vmcs12->host_rsp);
> +	kvm_register_write(vcpu, VCPU_REGS_RIP, vmcs12->host_rip);
> +	/*
> +	 * Note that calling vmx_set_cr0 is important, even if cr0 hasn't
> +	 * actually changed, because it depends on the current state of
> +	 * fpu_active (which may have changed).
> +	 * Note that vmx_set_cr0 refers to efer set above.
> +	 */
> +	kvm_set_cr0(vcpu, vmcs12->host_cr0);
> +	/*
> +	 * If we did fpu_activate()/fpu_deactivate() during L2's run, we need
> +	 * to apply the same changes to L1's vmcs. We just set cr0 correctly,
> +	 * but we also need to update cr0_guest_host_mask and exception_bitmap.
> +	 */
> +	update_exception_bitmap(vcpu);
> +	vcpu->arch.cr0_guest_owned_bits = (vcpu->fpu_active ? X86_CR0_TS : 0);
> +	vmcs_writel(CR0_GUEST_HOST_MASK, ~vcpu->arch.cr0_guest_owned_bits);
> +
> +	/*
> +	 * Note that CR4_GUEST_HOST_MASK is already set in the original vmcs01
> +	 * (KVM doesn't change it)- no reason to call set_cr4_guest_host_mask();
> +	 */
> +	vcpu->arch.cr4_guest_owned_bits = ~vmcs_readl(CR4_GUEST_HOST_MASK);
> +	kvm_set_cr4(vcpu, vmcs12->host_cr4);
> +
> +	/* shadow page tables on either EPT or shadow page tables */
> +	kvm_set_cr3(vcpu, vmcs12->host_cr3);
> +	kvm_mmu_reset_context(vcpu);
> +
> +	if (enable_vpid) {
> +		/*
> +		 * Trivially support vpid by letting L2s share their parent
> +		 * L1's vpid.
> +		 * TODO: move to a more elaborate solution, giving
> +		 * each L2 its own vpid and exposing the vpid feature to L1.
> +		 */
> +		vmx_flush_tlb(vcpu);
> +	}

How about the SYSENTER and PERF_GLOBAL_CTRL MSRs? At least a TODO comment
here would make the whole load process complete. :-)

Also, isn't it more sane to update vmcs01's guest segment info based on
vmcs12's host segment info? Though you can assume the environment in L1
doesn't change between VMLAUNCH/VMRESUME and the VMEXIT handler, it's
architecturally cleaner to load those segment fields according to L1's
desire.

Thanks
Kevin
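
To make the SYSENTER and segment suggestions concrete, here is a minimal
user-space sketch in C. It is not kernel code: the names mock_vmcs12_host,
mock_guest_state and mock_load_host_state are hypothetical stand-ins for the
vmcs12 host-state fields and the vmcs01 guest-state fields, and a real version
would go through KVM's VMCS accessors. Segment limits, access rights and
PERF_GLOBAL_CTRL are left out on purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the host-state fields L1 wrote into vmcs12. */
struct mock_vmcs12_host {
	uint16_t host_cs_selector, host_ss_selector, host_ds_selector;
	uint16_t host_es_selector, host_fs_selector, host_gs_selector;
	uint16_t host_tr_selector;
	uint64_t host_fs_base, host_gs_base, host_tr_base;
	uint32_t host_ia32_sysenter_cs;
	uint64_t host_ia32_sysenter_esp, host_ia32_sysenter_eip;
};

/* Hypothetical stand-in for the guest-state fields of vmcs01 (L1's state). */
struct mock_guest_state {
	uint16_t cs, ss, ds, es, fs, gs, tr;	/* segment selectors */
	uint64_t fs_base, gs_base, tr_base;
	uint32_t sysenter_cs;
	uint64_t sysenter_esp, sysenter_eip;
};

/*
 * Copy the host state L1 asked for in vmcs12 into L1's guest state, instead
 * of assuming it is unchanged since VMLAUNCH/VMRESUME.
 */
void mock_load_host_state(struct mock_guest_state *g,
			  const struct mock_vmcs12_host *h)
{
	assert(g && h);

	/* Segment selectors come straight from the vmcs12 host fields. */
	g->cs = h->host_cs_selector;
	g->ss = h->host_ss_selector;
	g->ds = h->host_ds_selector;
	g->es = h->host_es_selector;
	g->fs = h->host_fs_selector;
	g->gs = h->host_gs_selector;
	g->tr = h->host_tr_selector;

	/* Only FS, GS and TR have host base fields to copy. */
	g->fs_base = h->host_fs_base;
	g->gs_base = h->host_gs_base;
	g->tr_base = h->host_tr_base;

	/* SYSENTER MSRs are part of the host state as well. */
	g->sysenter_cs = h->host_ia32_sysenter_cs;
	g->sysenter_esp = h->host_ia32_sysenter_esp;
	g->sysenter_eip = h->host_ia32_sysenter_eip;
}
```

In load_vmcs12_host_state() the equivalent writes would presumably sit next to
the RSP/RIP updates, so that all of L1's requested host state is loaded in one
place.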