On 2021-09-28 1:55 p.m., Jason Gunthorpe wrote:
> On Thu, Sep 16, 2021 at 05:40:59PM -0600, Logan Gunthorpe wrote:
>> +int pci_mmap_p2pmem(struct pci_dev *pdev, struct vm_area_struct *vma)
>> +{
>> +	struct pci_p2pdma_map *pmap;
>> +	struct pci_p2pdma *p2pdma;
>> +	int ret;
>> +
>> +	/* prevent private mappings from being established */
>> +	if ((vma->vm_flags & VM_MAYSHARE) != VM_MAYSHARE) {
>> +		pci_info_ratelimited(pdev,
>> +				     "%s: fail, attempted private mapping\n",
>> +				     current->comm);
>> +		return -EINVAL;
>> +	}
>> +
>> +	pmap = pci_p2pdma_map_alloc(pdev, vma->vm_end - vma->vm_start);
>> +	if (!pmap)
>> +		return -ENOMEM;
>> +
>> +	rcu_read_lock();
>> +	p2pdma = rcu_dereference(pdev->p2pdma);
>> +	if (!p2pdma) {
>> +		ret = -ENODEV;
>> +		goto out;
>> +	}
>> +
>> +	ret = simple_pin_fs(&pci_p2pdma_fs_type, &pci_p2pdma_fs_mnt,
>> +			    &pci_p2pdma_fs_cnt);
>> +	if (ret)
>> +		goto out;
>> +
>> +	ihold(p2pdma->inode);
>> +	pmap->inode = p2pdma->inode;
>> +	rcu_read_unlock();
>> +
>> +	vma->vm_flags |= VM_MIXEDMAP;
>
> Why is this a VM_MIXEDMAP? Everything fault sticks in here has a
> struct page, right?

Yes. This decision is not so simple; I tried a few variations before
settling on this.

The main reason is probably this: if we don't use VM_MIXEDMAP, then we
can't set pte_devmap(). If we don't set pte_devmap(), then every single
page that GUP processes needs to check if it's a ZONE_DEVICE page and
also if it's a P2PDMA page (thus dereferencing pgmap) in order to
satisfy the requirements of FOLL_PCI_P2PDMA. I didn't think other
developers would go for that kind of performance hit. With VM_MIXEDMAP
we hide the performance penalty behind the existing check. And with the
current pgmap code as is, we only need to do that check once for every
new pgmap pointer.
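
For a rough idea of the cost, this is approximately the per-page check
GUP would have to do without pte_devmap() in order to honor
FOLL_PCI_P2PDMA. The helper name below is just illustrative and not
part of the series; FOLL_PCI_P2PDMA is the flag this series introduces:

#include <linux/mm.h>
#include <linux/memremap.h>

/*
 * Illustrative sketch only: every page GUP extracts would need this,
 * touching page->pgmap for each ZONE_DEVICE page it encounters.
 */
static inline bool gup_page_allowed_p2pdma(struct page *page,
					   unsigned int gup_flags)
{
	/* Normal memory: nothing to check */
	if (!is_zone_device_page(page))
		return true;

	/* ZONE_DEVICE page: dereference the pgmap to get its type */
	if (page->pgmap->type != MEMORY_DEVICE_PCI_P2PDMA)
		return true;

	/* P2PDMA pages are only allowed when the caller opted in */
	return gup_flags & FOLL_PCI_P2PDMA;
}

With pte_devmap() set, that pgmap dereference hides behind the existing
devmap check in GUP and, as above, only has to happen once per new
pgmap pointer.

Logan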