On 2013-08-06 17:53, Gleb Natapov wrote:
> On Tue, Aug 06, 2013 at 05:48:54PM +0200, Jan Kiszka wrote:
>> On 2013-08-06 17:04, Zhang, Yang Z wrote:
>>> Gleb Natapov wrote on 2013-08-06:
>>>> On Tue, Aug 06, 2013 at 02:12:51PM +0000, Zhang, Yang Z wrote:
>>>>> Gleb Natapov wrote on 2013-08-06:
>>>>>> On Tue, Aug 06, 2013 at 11:44:41AM +0000, Zhang, Yang Z wrote:
>>>>>>> Gleb Natapov wrote on 2013-08-06:
>>>>>>>> On Tue, Aug 06, 2013 at 10:39:59AM +0200, Jan Kiszka wrote:
>>>>>>>>> From: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
>>>>>>>>>
>>>>>>>>> If nested EPT is enabled, the L2 guest may change CR3 without any
>>>>>>>>> exits. We therefore have to read the current value from the VMCS
>>>>>>>>> when switching to L1. However, if paging wasn't enabled, L0 tracks
>>>>>>>>> L2's CR3, and GUEST_CR3 rather contains the real-mode identity map.
>>>>>>>>> So we need to retrieve CR3 from the architectural state after
>>>>>>>>> conditionally updating it - and this is what kvm_read_cr3 does.
>>>>>>>>>
>>>>>>>> I have a headache from trying to think about it already, but
>>>>>>>> shouldn't L1 be the one who sets up the identity map for L2? I
>>>>>>>> traced what vmcs_read64(GUEST_CR3)/kvm_read_cr3(vcpu) return here
>>>>>>>> and do not see
>>>>>>> Here is my understanding:
>>>>>>> In vmx_set_cr3(), if EPT is enabled, it will check whether the
>>>>>>> target vcpu has paging enabled. When L2 is running in real mode,
>>>>>>> the target vcpu does not have paging enabled and it will use L0's
>>>>>>> identity map for L2. If you read GUEST_CR3 from the VMCS, then you
>>>>>>> may get the L2's identity map, not L1's.
>>>>>>>
>>>>>> Yes, but why does it make sense to use L0's identity map for L2? I
>>>>>> didn't see different vmcs_read64(GUEST_CR3)/kvm_read_cr3(vcpu) values
>>>>>> because L0 and L1 use the same identity map address. When I changed
>>>>>> the identity map address L1 configures,
>>>>>> vmcs_read64(GUEST_CR3)/kvm_read_cr3(vcpu) are indeed different, but
>>>>>> the real CR3 L2 uses points to L0's identity map. If I zero L1's
>>>>>> identity map page, L2 still works.
>>>>>>
>>>>> If L2 is in real mode, then L2PA == L1PA. So L0's identity map also
>>>>> works if L2 is in real mode.
>>>>>
>>>> That's not the point. It may work accidentally for kvm on kvm, but
>>>> what if another hypervisor plays different tricks and builds a
>>>> different ident map for its guest?
>>> Yes, if another hypervisor doesn't build the 1:1 mapping for its
>>> guest, it will fail to work. But I cannot imagine what kind of
>>> hypervisor would do this and what the purpose would be.
>>> Anyway, the current logic is definitely wrong. It should use L1's
>>> identity map instead of L0's.
>>
>> So something like this is rather needed?
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 44494ed..60a3644 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3375,8 +3375,10 @@ static void vmx_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
>>  	if (enable_ept) {
>>  		eptp = construct_eptp(cr3);
>>  		vmcs_write64(EPT_POINTER, eptp);
>> -		guest_cr3 = is_paging(vcpu) ? kvm_read_cr3(vcpu) :
>> -			vcpu->kvm->arch.ept_identity_map_addr;
>> +		if (is_paging(vcpu) || is_guest_mode(vcpu))
>> +			guest_cr3 = kvm_read_cr3(vcpu);
>> +		else
>> +			guest_cr3 = vcpu->kvm->arch.ept_identity_map_addr;
>>  		ept_load_pdptrs(vcpu);
>>  	}
>>
> That's what I am thinking, will think about it some more tomorrow.

OK, and I'll feed it into a local test.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
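
For readers following the thread, the selection logic proposed in the diff above can be modelled outside the kernel. This is only a minimal standalone sketch, not kernel code: struct vcpu_model, its fields, and select_guest_cr3() are hypothetical names invented for illustration. They stand in for is_paging(), is_guest_mode(), kvm_read_cr3() and ept_identity_map_addr as they appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the vcpu state the patch inspects. */
struct vcpu_model {
	bool paging_enabled;      /* models is_paging(vcpu) */
	bool guest_mode;          /* models is_guest_mode(vcpu), i.e. L2 is running */
	uint64_t arch_cr3;        /* models kvm_read_cr3(vcpu) */
	uint64_t l0_identity_map; /* models vcpu->kvm->arch.ept_identity_map_addr */
};

/*
 * Model of the proposed choice for GUEST_CR3 when EPT is enabled:
 * use the architectural CR3 if paging is on OR the vcpu is in guest
 * mode; fall back to L0's identity map only for a non-nested guest
 * that has paging disabled.
 */
static uint64_t select_guest_cr3(const struct vcpu_model *v)
{
	assert(v != NULL);

	if (v->paging_enabled || v->guest_mode)
		return v->arch_cr3;

	return v->l0_identity_map;
}
```

With paging off and guest mode on (L2 in real mode), the sketch returns the architectural CR3 rather than L0's identity map address, which is the behaviour change the diff proposes. Whether that is sufficient in the real code path is exactly what the thread leaves open for testing.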