On Thu, Oct 4, 2012 at 10:23 PM, Min-gyu Kim <mingyu84.kim@xxxxxxxxxxx> wrote:
>
>
>> -----Original Message-----
>> From: kvm-owner@xxxxxxxxxxxxxxx [mailto:kvm-owner@xxxxxxxxxxxxxxx]
>> On Behalf Of Christoffer Dall
>> Sent: Monday, October 01, 2012 6:11 PM
>> To: kvm@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
>> kvmarm@xxxxxxxxxxxxxxxxxxxxx
>> Cc: Marc Zyngier
>> Subject: [PATCH v2 06/14] KVM: ARM: Memory virtualization setup
>>
>> +static void stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>> +			   phys_addr_t addr, const pte_t *new_pte)
>> +{
>> +	pgd_t *pgd;
>> +	pud_t *pud;
>> +	pmd_t *pmd;
>> +	pte_t *pte, old_pte;
>> +
>> +	/* Create 2nd stage page table mapping - Level 1 */
>> +	pgd = kvm->arch.pgd + pgd_index(addr);
>> +	pud = pud_offset(pgd, addr);
>> +	if (pud_none(*pud)) {
>> +		if (!cache)
>> +			return; /* ignore calls from kvm_set_spte_hva */
>> +		pmd = mmu_memory_cache_alloc(cache);
>> +		pud_populate(NULL, pud, pmd);
>> +		pmd += pmd_index(addr);
>> +		get_page(virt_to_page(pud));
>> +	} else
>> +		pmd = pmd_offset(pud, addr);
>> +
>> +	/* Create 2nd stage page table mapping - Level 2 */
>> +	if (pmd_none(*pmd)) {
>> +		if (!cache)
>> +			return; /* ignore calls from kvm_set_spte_hva */
>> +		pte = mmu_memory_cache_alloc(cache);
>> +		clean_pte_table(pte);
>> +		pmd_populate_kernel(NULL, pmd, pte);
>> +		pte += pte_index(addr);
>> +		get_page(virt_to_page(pmd));
>> +	} else
>> +		pte = pte_offset_kernel(pmd, addr);
>> +
>> +	/* Create 2nd stage page table mapping - Level 3 */
>> +	old_pte = *pte;
>> +	set_pte_ext(pte, *new_pte, 0);
>> +	if (pte_present(old_pte))
>> +		__kvm_tlb_flush_vmid(kvm);
>> +	else
>> +		get_page(virt_to_page(pte));
>> +}
>
>
> I'm not sure about the 3-level page table, but isn't it necessary to
> clean the page table for the 2nd level as well?
> There are two mmu_memory_cache_alloc calls. One is followed by
> clean_pte_table and the other is not.

Hmm, it probably is - I couldn't really find a common case where this is
done in the kernel normally (except for some custom loops in ioremap and
idmap), but I added this fix:

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 5394a52..f11ba27f 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -430,6 +430,7 @@ static void stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		if (!cache)
 			return; /* ignore calls from kvm_set_spte_hva */
 		pmd = mmu_memory_cache_alloc(cache);
+		clean_dcache_area(pmd, PTRS_PER_PMD * sizeof(pmd_t));
 		pud_populate(NULL, pud, pmd);
 		pmd += pmd_index(addr);
 		get_page(virt_to_page(pud));

>
> And why do you ignore calls from kvm_set_spte_hva? It is supposed to
> happen when the host moves the page, right? Then you ignore the case
> because it can be handled later when the fault actually happens? Is
> there any other reason that I'm missing?
>

kvm_set_spte_hva tells us that a page at some IPA is going to be backed
by another physical page, which means we must adjust the stage 2 mapping.
However, if we don't have that page mapped in the stage 2 page tables, we
don't need to do anything, and we certainly don't want to start
allocating unnecessary level 2 and level 3 page tables.

Thanks!
-Christoffer
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
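
[Editor's sketch, not part of the original thread: to make the cache == NULL
convention discussed above concrete, here is a minimal illustration of what a
cache-less caller along the lines of kvm_set_spte_hva could look like. The
function name and signature below are assumptions for illustration, not code
from the patch series; only stage2_set_pte() is taken from the patch.]

/*
 * Illustrative sketch only (assumed helper, not from the series): a
 * cache-less caller in the style of kvm_set_spte_hva.  Passing
 * cache == NULL tells stage2_set_pte() to bail out if the level 2 or
 * level 3 tables do not exist yet; in that case the guest has never
 * faulted on this IPA, and the mapping will be built on the next
 * stage 2 fault, which does carry a pre-filled memory cache.
 */
static void example_change_stage2_pte(struct kvm *kvm, phys_addr_t ipa,
				      pte_t new_pte)
{
	/* No memory cache: only update a mapping that already exists. */
	stage2_set_pte(kvm, NULL, ipa, &new_pte);
}

[The design point is that the MMU notifier path only narrows or updates
existing mappings; table allocation is deferred to the stage 2 fault handler,
which can top up its kvm_mmu_memory_cache before taking the MMU lock.]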