Hi Marc,

On 9/2/20 12:10 PM, Marc Zyngier wrote:
> On 2020-09-02 11:59, Alexandru Elisei wrote:
>> Hi,
>>
>> On 8/22/20 3:44 AM, Gavin Shan wrote:
>>> Depending on the kernel configuration, PUD_SIZE could be equal to
>>> PMD_SIZE. For example, both of them are 512MB with the following
>>> kernel configuration, in which case both PUD and PMD are folded
>>> into PGD.
>>>
>>>    CONFIG_ARM64_64K_PAGES   y
>>>    CONFIG_ARM64_VA_BITS     42
>>>    CONFIG_PGTABLE_LEVELS    2
>>>
>>> With the above configuration, the stage2 PUD is used to back the
>>> 512MB huge page when the stage2 mapping is built. During the mapping,
>>> the PUD and its subordinate levels of page table entries are unmapped
>>> in stage2_set_pud_huge() if the PUD is present but not mapped as a
>>> huge page. Unfortunately, @addr isn't aligned to S2_PUD_SIZE, so the
>>> wrong page table entries are zapped. As a result, the PUD's present
>>> bit can't be cleared, and stage2_set_pud_huge() loops forever.
>>>
>>> This fixes the issue by checking against S2_{PUD, PMD}_SIZE instead
>>> of {PUD, PMD}_SIZE to determine whether a stage2 PUD or PMD is used
>>> to back the huge page. For this particular case, the stage2 PMD entry
>>> should be used to back the 512MB huge page with stage2_set_pmd_huge().
>>
>> I can reproduce this on my rockpro64 using kvmtool.
>>
>> I see two issues here: first, PUD_SIZE = 512MB, but S2_PUD_SIZE = 4TB
>> (checked using printk), and second, stage2_set_pud_huge() hangs. I'm
>> working on debugging them.
>
> I have this as an immediate fix for the set_pud_huge hang, tested
> on Seattle with 64k/42bits.
>
> I can't wait to see the back of this code...

The problem is in stage2_set_pud_huge(), because kvm_stage2_has_pmd()
returns false (CONFIG_PGTABLE_LEVELS = 2):

	pudp = stage2_get_pud(mmu, cache, addr);
	VM_BUG_ON(!pudp);

	old_pud = *pudp;
	[..]
	// Returns 1 because !kvm_stage2_has_pmd()
	if (stage2_pud_present(kvm, old_pud)) {
		/*
		 * If we already have table level mapping for this block, unmap
		 * the range for this block and retry.
		 */
		// Always true because !kvm_stage2_has_pmd()
		if (!stage2_pud_huge(kvm, old_pud)) {
			unmap_stage2_range(mmu, addr & S2_PUD_MASK, S2_PUD_SIZE);
			goto retry;
		}

And we end up jumping back to retry forever.

IMO, in user_mem_abort(), if PUD_SIZE == PMD_SIZE, we should try to map
PMD_SIZE instead of PUD_SIZE. Maybe something like this?

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index ba00bcc0c884..178267dec511 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1886,8 +1886,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * As for PUD huge maps, we must make sure that we have at least
 	 * 3 levels, i.e, PMD is not folded.
 	 */
-	if (vma_pagesize == PMD_SIZE ||
-	    (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
+	if (vma_pagesize == PUD_SIZE && !kvm_stage2_has_pmd(kvm))
+		vma_pagesize = PMD_SIZE;
+
+	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
 		gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
 	mmap_read_unlock(current->mm);

Thanks,
Alex
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm