On 11/10/2011 04:40 PM, Nadav Har'El wrote:
> On Thu, Nov 10, 2011, Avi Kivity wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT":
> > > +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
> > > +{
> > > +	int r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
> >...
> > > +	vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu;
> >...
> >
> > kvm_init_shadow_mmu() will cause ->page_fault to be set to something
> > like paging64_page_fault(), which is geared to reading EPT ptes.  How
> > does this work?

s/EPT/ia32/

>
> Hi,
>
> I'm afraid I didn't understand the problem.
>
> Nested EPT's merging of two EPT tables (EPT01 and EPT12) works just like
> normal shadow page tables' merging of two CR3s (host cr3 and guest cr3):
>
> When L0 receives a "page fault" from L2 (actually an EPT violation - real
> guest #PF don't cause exits), L0 first looks it up in the shadowed table,
> which is basically EPT12. If the address is there, L0 handles the fault itself
> (updating the shadow EPT table, EPT02 using the normal shadow pte building
> code). But if the address wasn't in the shadowed page table (EPT12),
> mmu->inject_page_fault() is called, which in our case actually causes L1 to
> get an EPT-violation (not #PF - see kvm_propagate_fault()).
>
> Please note that all this logic is shared with the existing nested NPT
> code (which itself shared most of the code with the preexisting shadow
> page tables code). All this code sharing makes it really difficult to
> understand at first glance why the code is really working, but once you
> understood why one of these cases works, the others work similarly.
> And it does in fact work - in typical cases which I tried, at least.
>
> If you still think I'm missing something, I won't be entirely surprised
> ( :-) ), so let me know.

This is all correct, but the code in question parses the EPT12 table
using the ia32 page table format.  They're sufficiently similar so that
it works, but it isn't correct.
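
To make the mismatch concrete, here is a rough standalone sketch (not
kernel code) that decodes one 64-bit entry the way an ia32 walker would.
The macro, struct and function names are made up for illustration; the
bit positions are the ones in the bit layout listed in this mail, and the
write-back memory type value (6) is taken from the EPT entry format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EPT view of the low bits (names are hypothetical, for this sketch only). */
#define EPT_READ          (1ULL << 0)
#define EPT_WRITE         (1ULL << 1)
#define EPT_EXEC          (1ULL << 2)
#define EPT_MEMTYPE_SHIFT 3
#define EPT_MEMTYPE_WB    6ULL          /* write-back, lives in bits 5:3 */

/* ia32 view of the very same bits. */
#define IA32_PRESENT      (1ULL << 0)
#define IA32_WRITE        (1ULL << 1)
#define IA32_USER         (1ULL << 2)
#define IA32_PCD          (1ULL << 4)
#define IA32_ACCESSED     (1ULL << 5)

struct ia32_view {
    bool present;
    bool writable;
    bool user;
    bool cache_disable;
    bool accessed;
};

/* Decode an entry as an ia32 walker would, regardless of what it really is. */
struct ia32_view decode_as_ia32(uint64_t pte)
{
    struct ia32_view v = {
        .present       = (pte & IA32_PRESENT) != 0,
        .writable      = (pte & IA32_WRITE) != 0,
        .user          = (pte & IA32_USER) != 0,
        .cache_disable = (pte & IA32_PCD) != 0,
        .accessed      = (pte & IA32_ACCESSED) != 0,
    };
    return v;
}

/* True when an EPT entry would be taken for a supervisor-only ia32 mapping. */
bool ept_entry_read_as_kernel_only(uint64_t ept_pte)
{
    struct ia32_view v = decode_as_ia32(ept_pte);
    return v.present && !v.user;
}

/* Small demonstration of the misreadings described in this mail. */
void demo_misparse(void)
{
    /* Readable + writable, not executable, write-back memory type. */
    uint64_t ept = EPT_READ | EPT_WRITE |
                   (EPT_MEMTYPE_WB << EPT_MEMTYPE_SHIFT);
    struct ia32_view v = decode_as_ia32(ept);

    assert(v.present && v.writable);   /* bits 0/1 happen to line up      */
    assert(!v.user);                   /* no-exec reads as kernel-only    */
    assert(v.cache_disable);           /* WB memory type reads as PCD     */
    assert(v.accessed);                /* ... and as an Accessed bit      */
}
```

Calling demo_misparse() from a test program exercises all four cases; the
point is only that the same bit pattern yields different meanings under
the two formats.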

Bit 0: EPT readable, ia32 present
Bit 1: Writable; ia32 meaning dependent on cr0.wp
Bit 2: EPT executable, ia32 user (so, this implementation will interpret
a non-executable EPT mapping, if someone could find a use for it, as a
L2 kernel only mapping)
Bits 3-5: EPT memory type, ia32 PWT/PCD (similar but different),
Accessed bit
Bit 6: EPT Ignore PAT, ia32 dirty
Bit 7: EPT ignored, ia32 PAT
Bit 8: EPT ignored, ia32 global
Bit 63: EPT ignored, ia32 NX

walk_addr() will also write to bits 6/7, which the L1 won't expect.

-- 
error compiling committee.c: too many arguments to function