Hi Santosh, On 2020/10/22 0:16, Santosh Shukla wrote: > The Commit:6d674e28 introduces a notion to detect and handle the > device mapping. The commit checks for the VM_PFNMAP flag is set > in vma->flags and if set then marks force_pte to true such that > if force_pte is true then ignore the THP function check > (/transparent_hugepage_adjust()). > > There could be an issue with the VM_PFNMAP flag setting and checking. > For example consider a case where the mdev vendor driver register's > the vma_fault handler named vma_mmio_fault(), which maps the > host MMIO region in-turn calls remap_pfn_range() and maps > the MMIO's vma space. Where, remap_pfn_range implicitly sets > the VM_PFNMAP flag into vma->flags. Could you give the name of the mdev vendor driver that triggers this issue? I failed to find one according to your description. Thanks. BRs, Keqian > > Now lets assume a mmio fault handing flow where guest first access > the MMIO region whose 2nd stage translation is not present. > So that results to arm64-kvm hypervisor executing guest abort handler, > like below: > > kvm_handle_guest_abort() --> > user_mem_abort()--> { > > ... > 0. checks the vma->flags for the VM_PFNMAP. > 1. Since VM_PFNMAP flag is not yet set so force_pte _is_ false; > 2. gfn_to_pfn_prot() --> > __gfn_to_pfn_memslot() --> > fixup_user_fault() --> > handle_mm_fault()--> > __do_fault() --> > vma_mmio_fault() --> // vendor's mdev fault handler > remap_pfn_range()--> // Here sets the VM_PFNMAP > flag into vma->flags. > 3. Now that force_pte is set to false in step-2), > will execute transparent_hugepage_adjust() func and > that lead to Oops [4]. > } > > The proposition is to check is_iomap flag before executing the THP > function transparent_hugepage_adjust(). > > [4] THP Oops: >> pc: kvm_is_transparent_hugepage+0x18/0xb0 >> ... >> ... >> user_mem_abort+0x340/0x9b8 >> kvm_handle_guest_abort+0x248/0x468 >> handle_exit+0x150/0x1b0 >> kvm_arch_vcpu_ioctl_run+0x4d4/0x778 >> kvm_vcpu_ioctl+0x3c0/0x858 >> ksys_ioctl+0x84/0xb8 >> __arm64_sys_ioctl+0x28/0x38 > > Tested on Huawei Kunpeng Taishan-200 arm64 server, Using VFIO-mdev device. > Linux tip: 583090b1 > > Fixes: 6d674e28 ("KVM: arm/arm64: Properly handle faulting of device mappings") > Signed-off-by: Santosh Shukla <sashukla@xxxxxxxxxx> > --- > arch/arm64/kvm/mmu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > index 3d26b47..ff15357 100644 > --- a/arch/arm64/kvm/mmu.c > +++ b/arch/arm64/kvm/mmu.c > @@ -1947,7 +1947,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, > * If we are not forced to use page mapping, check if we are > * backed by a THP and thus use block mapping if possible. > */ > - if (vma_pagesize == PAGE_SIZE && !force_pte) > + if (vma_pagesize == PAGE_SIZE && !force_pte && !is_iomap(flags)) > vma_pagesize = transparent_hugepage_adjust(memslot, hva, > &pfn, &fault_ipa); > if (writable) >