When userspace uses hugetlbfs for the VM memory, user_mem_abort() tries to
use the same block size to map the faulting IPA in stage 2. If stage 2
cannot use the same block mapping because the block size doesn't fit in the
memslot or the memslot is not properly aligned, user_mem_abort() will fall
back to a page mapping, regardless of the block size. We can do better for
PUD-backed hugetlbfs by checking if a PMD block mapping is supported before
falling back to a page mapping. Because the stage 2 mapping size can now be
smaller than the VMA hugepage size, compute the alignment mask for
fault_ipa from vma_pagesize instead of using huge_page_mask().

vma_pagesize is an unsigned long, so use 1UL instead of 1ULL when assigning
its value.

Signed-off-by: Alexandru Elisei <alexandru.elisei@xxxxxxx>
---
Tested on a rockpro64 with 4K pages and hugetlbfs hugepagesz=1G (PUD sized
block mappings). First test: guest RAM starts at 0x8100 0000
(memslot->base_gfn not aligned to 1GB); second test: guest RAM starts at
0x8000 0000, but is only 512 MB. In both cases using PUD mappings is not
possible because either the memslot base address is not aligned, or the
mapping would extend beyond the memslot.

Without the changes, user_mem_abort() uses 4K pages to map the guest IPA.
With the changes, user_mem_abort() uses PMD block mappings (2MB) to map the
guest RAM, which means less TLB pressure and fewer stage 2 aborts.

Changes since v1 [1]:
- Rebased on top of Will's stage 2 page table handling rewrite, version 4
  of the series [2]. His series is missing the patch "KVM: arm64: Update
  page shift if stage 2 block mapping not supported", so there might be a
  conflict (it's straightforward to fix).

[1] https://www.spinics.net/lists/arm-kernel/msg834015.html
[2] https://www.spinics.net/lists/arm-kernel/msg835806.html

 arch/arm64/kvm/mmu.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 1041be1fafe4..39c539d4d4cb 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -776,16 +776,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	else
 		vma_shift = PAGE_SHIFT;
 
-	vma_pagesize = 1ULL << vma_shift;
 	if (logging_active ||
-	    (vma->vm_flags & VM_PFNMAP) ||
-	    !fault_supports_stage2_huge_mapping(memslot, hva, vma_pagesize)) {
+	    (vma->vm_flags & VM_PFNMAP)) {
 		force_pte = true;
-		vma_pagesize = PAGE_SIZE;
+		vma_shift = PAGE_SHIFT;
+	}
+
+	if (vma_shift == PUD_SHIFT &&
+	    !fault_supports_stage2_huge_mapping(memslot, hva, PUD_SIZE))
+		vma_shift = PMD_SHIFT;
+
+	if (vma_shift == PMD_SHIFT &&
+	    !fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
+		force_pte = true;
+		vma_shift = PAGE_SHIFT;
 	}
 
+	vma_pagesize = 1UL << vma_shift;
 	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
-		fault_ipa &= huge_page_mask(hstate_vma(vma));
+		fault_ipa &= ~(vma_pagesize - 1);
 	gfn = fault_ipa >> PAGE_SHIFT;
 	mmap_read_unlock(current->mm);
 
-- 
2.28.0
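
For anyone who wants to play with the fallback logic outside the kernel,
below is a minimal userspace sketch. It is not kernel code:
supports_huge_mapping() is a hypothetical, simplified stand-in for
fault_supports_stage2_huge_mapping() (the real check also verifies that the
hva and the IPA are compatibly aligned), the PMD_SHIFT/PUD_SHIFT values
assume 4K pages, and the sketch pretends the hva and IPA layouts coincide.
It reproduces the second test case above: the PUD check fails because a 1GB
block would extend past the 512 MB memslot, so the logic falls back to a
2MB PMD mapping and aligns the fault address with the same mask arithmetic
the patch uses in place of huge_page_mask().

/*
 * Userspace sketch of the PUD -> PMD -> page fallback. Not kernel code;
 * PUD_SHIFT/PMD_SHIFT values assume 4K pages with at least 3 levels.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT	12
#define PMD_SHIFT	21
#define PUD_SHIFT	30

/* Simplified stand-in for fault_supports_stage2_huge_mapping(). */
static bool supports_huge_mapping(uint64_t base, uint64_t size,
				  uint64_t hva, uint64_t map_size)
{
	/* The memslot base must be aligned to the block size... */
	if (base & (map_size - 1))
		return false;
	/* ...and the block containing hva must not leak past the memslot. */
	uint64_t start = hva & ~(map_size - 1);
	return start >= base && start + map_size <= base + size;
}

int main(void)
{
	/* Second test case from above: 512 MB memslot at 0x8000 0000. */
	uint64_t memslot_base = 0x80000000, memslot_size = 512UL << 20;
	uint64_t hva = 0x80200000, fault_ipa = 0x80200abc;
	unsigned long vma_shift = PUD_SHIFT;	/* 1G hugetlbfs backing */

	if (vma_shift == PUD_SHIFT &&
	    !supports_huge_mapping(memslot_base, memslot_size, hva,
				   1UL << PUD_SHIFT))
		vma_shift = PMD_SHIFT;
	if (vma_shift == PMD_SHIFT &&
	    !supports_huge_mapping(memslot_base, memslot_size, hva,
				   1UL << PMD_SHIFT))
		vma_shift = PAGE_SHIFT;

	unsigned long vma_pagesize = 1UL << vma_shift;
	/* Same alignment the patch uses instead of huge_page_mask(). */
	fault_ipa &= ~((uint64_t)vma_pagesize - 1);
	printf("map size %lu, aligned IPA 0x%llx\n", vma_pagesize,
	       (unsigned long long)fault_ipa);
	return 0;	/* prints: map size 2097152, aligned IPA 0x80200000 */
}

The ~(vma_pagesize - 1) mask is equivalent to huge_page_mask() when the
mapping size matches the VMA hugepage size, and it keeps working after the
fallback, when vma_pagesize is smaller; the masking only relies on block
sizes being powers of two, which is also what lets the patch drop the
hstate_vma() dependency.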