On Thu, 15 Apr 2021 15:08:09 +0100,
Keqian Zhu <zhukeqian1@xxxxxxxxxx> wrote:
> 
> Hi Marc,
> 
> On 2021/4/15 22:03, Keqian Zhu wrote:
> > The MMIO region of a device may be huge (GB level), so try to use
> > block mapping in stage2 to speed up both map and unmap.
> > 
> > Compared to normal memory mapping, we should consider two more
> > points when trying block mapping for an MMIO region:
> > 
> > 1. For normal memory mapping, the PA (host physical address) and
> > HVA have the same alignment within PUD_SIZE or PMD_SIZE when we use
> > the HVA to request a hugepage, so we don't need to consider PA
> > alignment when verifying block mapping. But for device memory
> > mapping, the PA and HVA may have different alignment.
> > 
> > 2. For normal memory mapping, we are sure the hugepage size properly
> > fits into the vma, so we don't check whether the mapping size exceeds
> > the boundary of the vma. But for device memory mapping, we should pay
> > attention to this.
> > 
> > This adds get_vma_page_shift() to get the page shift for both normal
> > memory and device MMIO regions, and checks these two points when
> > selecting the block mapping size for MMIO regions.
> > 
> > Signed-off-by: Keqian Zhu <zhukeqian1@xxxxxxxxxx>
> > ---
> >  arch/arm64/kvm/mmu.c | 61 ++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 51 insertions(+), 10 deletions(-)
> > 
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index c59af5ca01b0..5a1cc7751e6d 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -738,6 +738,35 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
> >  	return PAGE_SIZE;
> >  }
> >  
> > +static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva)
> > +{
> > +	unsigned long pa;
> > +
> > +	if (is_vm_hugetlb_page(vma) && !(vma->vm_flags & VM_PFNMAP))
> > +		return huge_page_shift(hstate_vma(vma));
> > +
> > +	if (!(vma->vm_flags & VM_PFNMAP))
> > +		return PAGE_SHIFT;
> > +
> > +	VM_BUG_ON(is_vm_hugetlb_page(vma));
> > +
> > +	pa = (vma->vm_pgoff << PAGE_SHIFT) + (hva - vma->vm_start);
> > +
> > +#ifndef __PAGETABLE_PMD_FOLDED
> > +	if ((hva & (PUD_SIZE - 1)) == (pa & (PUD_SIZE - 1)) &&
> > +	    ALIGN_DOWN(hva, PUD_SIZE) >= vma->vm_start &&
> > +	    ALIGN(hva, PUD_SIZE) <= vma->vm_end)
> > +		return PUD_SHIFT;
> > +#endif
> > +
> > +	if ((hva & (PMD_SIZE - 1)) == (pa & (PMD_SIZE - 1)) &&
> > +	    ALIGN_DOWN(hva, PMD_SIZE) >= vma->vm_start &&
> > +	    ALIGN(hva, PMD_SIZE) <= vma->vm_end)
> > +		return PMD_SHIFT;
> > +
> > +	return PAGE_SHIFT;
> > +}
> > +
> >  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  			  struct kvm_memory_slot *memslot, unsigned long hva,
> >  			  unsigned long fault_status)
> > @@ -769,7 +798,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  		return -EFAULT;
> >  	}
> >  
> > -	/* Let's check if we will get back a huge page backed by hugetlbfs */
> > +	/*
> > +	 * Let's check if we will get back a huge page backed by hugetlbfs, or
> > +	 * get block mapping for device MMIO region.
> > +	 */
> >  	mmap_read_lock(current->mm);
> >  	vma = find_vma_intersection(current->mm, hva, hva + 1);
> >  	if (unlikely(!vma)) {
> > @@ -778,15 +810,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >  		return -EFAULT;
> >  	}
> >  
> > -	if (is_vm_hugetlb_page(vma))
> > -		vma_shift = huge_page_shift(hstate_vma(vma));
> > -	else
> > -		vma_shift = PAGE_SHIFT;
> > -
> > -	if (logging_active ||
> > -	    (vma->vm_flags & VM_PFNMAP)) {
> > +	/*
> > +	 * logging_active is guaranteed to never be true for VM_PFNMAP
> > +	 * memslots.
> > +	 */
> > +	if (logging_active) {
> >  		force_pte = true;
> >  		vma_shift = PAGE_SHIFT;
> > +	} else {
> > +		vma_shift = get_vma_page_shift(vma, hva);
> >  	}
> I use an if/else structure in v4, please check that. Thanks very much!

That's fine. However, it is getting a bit late for 5.13, and we don't
have much time to let it simmer in -next. I'll probably wait until
after the merge window to pick it up.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
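As an aside for readers following the thread: the two conditions from the
commit message reduce to a single predicate per block size. The sketch below
is illustrative only and not part of the patch; block_mapping_ok(), the 2MiB
block size and the sample addresses are made-up stand-ins, and the
authoritative logic is get_vma_page_shift() in the diff above (which also
handles hugetlbfs VMAs and folded-PMD configurations). The containment test
here is a simplified form of the ALIGN/ALIGN_DOWN checks in the patch.

/*
 * Illustrative user-space sketch (not part of the patch) of the PFNMAP
 * checks discussed above. Names and values are hypothetical.
 */
#include <stdbool.h>
#include <stdio.h>

/*
 * A block of size bsz (e.g. 2MiB for PMD, 1GiB for PUD) can back the fault
 * at hva only if:
 *  1. hva and pa share the same offset within the block (point 1), and
 *  2. the block containing hva lies fully inside [vm_start, vm_end) (point 2).
 */
static bool block_mapping_ok(unsigned long hva, unsigned long pa,
			     unsigned long vm_start, unsigned long vm_end,
			     unsigned long bsz)
{
	unsigned long block_start = hva & ~(bsz - 1);

	if ((hva & (bsz - 1)) != (pa & (bsz - 1)))
		return false;

	return block_start >= vm_start && block_start + bsz <= vm_end;
}

int main(void)
{
	const unsigned long SZ_2M   = 0x200000UL;
	const unsigned long vm_start = 0x7f4000200000UL;     /* 2MiB aligned */
	const unsigned long vm_end   = vm_start + 4 * SZ_2M; /* an 8MiB VMA  */
	const unsigned long hva      = vm_start + 0x10000UL; /* faulting HVA */

	/* BAR at a 2MiB-aligned PA: same in-block offset -> block is usable. */
	printf("aligned BAR:   %d\n",
	       block_mapping_ok(hva, 0xe0000000UL + 0x10000UL,
				vm_start, vm_end, SZ_2M));

	/* BAR at a 64KiB-aligned PA: offsets differ -> fall back to pages. */
	printf("unaligned BAR: %d\n",
	       block_mapping_ok(hva, 0xe0010000UL + 0x10000UL,
				vm_start, vm_end, SZ_2M));

	return 0;
}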