On Thu, 22 Apr 2021 03:25:23 +0100,
Gavin Shan <gshan@xxxxxxxxxx> wrote:
>
> Hi Keqian,
>
> On 4/21/21 4:36 PM, Keqian Zhu wrote:
> > On 2021/4/21 15:52, Gavin Shan wrote:
> >> On 4/16/21 12:03 AM, Keqian Zhu wrote:
> >>> The MMIO region of a device may be huge (GB level), try to use
> >>> block mapping in stage2 to speed up both map and unmap.
> >>>
> >>> Compared to normal memory mapping, we should consider two more
> >>> points when trying block mapping for MMIO region:
> >>>
> >>> 1. For normal memory mapping, the PA (host physical address) and
> >>> HVA have the same alignment within PUD_SIZE or PMD_SIZE when we use
> >>> the HVA to request hugepage, so we don't need to consider PA
> >>> alignment when verifying block mapping. But for device memory
> >>> mapping, the PA and HVA may have different alignment.
> >>>
> >>> 2. For normal memory mapping, we are sure hugepage size properly
> >>> fits into the vma, so we don't check whether the mapping size exceeds
> >>> the boundary of the vma. But for device memory mapping, we should pay
> >>> attention to this.
> >>>
> >>> This adds get_vma_page_shift() to get page shift for both normal
> >>> memory and device MMIO region, and checks these two points when
> >>> selecting block mapping size for MMIO region.
> >>>
> >>> Signed-off-by: Keqian Zhu <zhukeqian1@xxxxxxxxxx>
> >>> ---
> >>>  arch/arm64/kvm/mmu.c | 61 ++++++++++++++++++++++++++++++++++++--------
> >>>  1 file changed, 51 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> >>> index c59af5ca01b0..5a1cc7751e6d 100644
> >>> --- a/arch/arm64/kvm/mmu.c
> >>> +++ b/arch/arm64/kvm/mmu.c
> >>> @@ -738,6 +738,35 @@ transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
> >>>  	return PAGE_SIZE;
> >>>  }
> >>> +static int get_vma_page_shift(struct vm_area_struct *vma, unsigned long hva)
> >>> +{
> >>> +	unsigned long pa;
> >>> +
> >>> +	if (is_vm_hugetlb_page(vma) && !(vma->vm_flags & VM_PFNMAP))
> >>> +		return huge_page_shift(hstate_vma(vma));
> >>> +
> >>> +	if (!(vma->vm_flags & VM_PFNMAP))
> >>> +		return PAGE_SHIFT;
> >>> +
> >>> +	VM_BUG_ON(is_vm_hugetlb_page(vma));
> >>> +
> >>
> >> I don't understand how VM_PFNMAP is set for hugetlbfs related vma.
> >> I think they are exclusive, meaning the flag is never set for
> >> hugetlbfs vma. If it's true, VM_PFNMAP needn't be checked on hugetlbfs
> >> vma and the VM_BUG_ON() becomes unnecessary.
> > Yes, but we're not sure all drivers follow this rule. Adding a BUG_ON() is
> > a way to catch the issue.
> >
>
> I think I didn't make things clear. What I meant is VM_PFNMAP can't
> be set for hugetlbfs VMAs. So the checks here can be simplified as
> below if you agree:
>
>     if (is_vm_hugetlb_page(vma))
>         return huge_page_shift(hstate_vma(vma));
>
>     if (!(vma->vm_flags & VM_PFNMAP))
>         return PAGE_SHIFT;
>
>     VM_BUG_ON(is_vm_hugetlb_page(vma));  /* Can be dropped */

No. If this case happens, I want to see it. I have explicitly asked
for it, and this check stays.

	M.

-- 
Without deviation from the norm, progress is not possible.
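
For readers following the thread, the two extra conditions listed in the
quoted commit message (HVA and PA sharing the same offset within the block,
and the block staying inside the VMA) can be sketched in user-space C. This
is only an illustration under stated assumptions, not the code from the
patch: struct sketch_vma, block_fits(), sketch_mmio_page_shift() and the
SKETCH_* constants are hypothetical names, and the shift values assume a
4 KiB translation granule.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shifts for a 4 KiB granule; the kernel derives these per config. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PMD_SHIFT  21u /* 2 MiB block */
#define SKETCH_PUD_SHIFT  30u /* 1 GiB block */

/* Hypothetical stand-in for the few VMA fields the two checks need. */
struct sketch_vma {
	uint64_t vm_start; /* first HVA covered by the mapping */
	uint64_t vm_end;   /* one past the last HVA covered */
	uint64_t pa_start; /* host physical address backing vm_start */
};

/* Return 1 if a block of size (1 << shift) may be used to map hva. */
static int block_fits(const struct sketch_vma *vma, uint64_t hva,
		      unsigned int shift)
{
	uint64_t size = UINT64_C(1) << shift;
	uint64_t mask = size - 1;
	uint64_t pa = vma->pa_start + (hva - vma->vm_start);
	uint64_t block_start = hva & ~mask;

	/* Point 1: HVA and PA must have the same offset within the block. */
	if ((hva & mask) != (pa & mask))
		return 0;

	/* Point 2: the whole block must lie within [vm_start, vm_end). */
	if (block_start < vma->vm_start)
		return 0;
	if (vma->vm_end - block_start < size)
		return 0;

	return 1;
}

/* Pick the largest shift that passes both checks, else fall back to pages. */
static unsigned int sketch_mmio_page_shift(const struct sketch_vma *vma,
					   uint64_t hva)
{
	/* The faulting address is expected to be inside the VMA. */
	assert(hva >= vma->vm_start && hva < vma->vm_end);

	if (block_fits(vma, hva, SKETCH_PUD_SHIFT))
		return SKETCH_PUD_SHIFT;
	if (block_fits(vma, hva, SKETCH_PMD_SHIFT))
		return SKETCH_PMD_SHIFT;
	return SKETCH_PAGE_SHIFT;
}
```

With a 1 GiB VMA whose HVA and PA are both 1 GiB aligned, the sketch selects
the PUD-sized shift; shifting the PA by 2 MiB drops it to the PMD-sized
shift, and a VMA smaller than 2 MiB always falls back to the page shift.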