On 2022-09-23 12:13, Jason Gunthorpe wrote:
> On Thu, Sep 22, 2022 at 10:39:19AM -0600, Logan Gunthorpe wrote:
>> GUP Callers that expect PCI P2PDMA pages can now set FOLL_PCI_P2PDMA to
>> allow obtaining P2PDMA pages. If GUP is called without the flag and a
>> P2PDMA page is found, it will return an error.
>>
>> FOLL_PCI_P2PDMA cannot be set if FOLL_LONGTERM is set.
>
> What is causing this? It is really troublesome, I would like to fix
> it. eg I would like to have P2PDMA pages in VFIO iommu page tables and
> in RDMA MR's - both require longterm.

You had said it was required if we were relying on unmap_mapping_range()...

https://lore.kernel.org/all/20210928200506.GX3544071@xxxxxxxx/T/#u

> Is it just because ZONE_DEVICE was created for DAX and carried that
> revocable assumption over? Does anything in your series require
> revocable?

We still rely on unmap_mapping_range() indirectly in the unbind path. So I
expect that if something takes a LONGTERM mapping, the unbind would block
until whatever process holds the pin releases it. That's less than ideal
and I'm not sure what can be done about it.

>> @@ -2383,6 +2392,10 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>>  		VM_BUG_ON(!pfn_valid(pte_pfn(pte)));
>>  		page = pte_page(pte);
>>
>> +		if (unlikely(!(flags & FOLL_PCI_P2PDMA) &&
>> +			     is_pci_p2pdma_page(page)))
>> +			goto pte_unmap;
>> +
>>  		folio = try_grab_folio(page, 1, flags);
>>  		if (!folio)
>>  			goto pte_unmap;
>
> On closer look this is not in the right place, we cannot touch the
> content of *page without holding a ref, and that doesn't happen until
> try_grab_folio() completes.
>
> It would be simpler to put this check in try_grab_folio/try_grab_page
> after the ref has been obtained. That will naturally cover all the
> places that need it.

Ok, I can make that change.

Logan
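
For illustration only, the ordering suggested above (take the reference
first, inspect the page second, and drop the reference again if the
P2PDMA check fails) can be mocked up in user-space C. This is a rough
sketch and not the kernel code or the eventual patch: struct mock_page,
mock_try_grab() and MOCK_FOLL_PCI_P2PDMA are hypothetical stand-ins for
struct page, try_grab_folio() and FOLL_PCI_P2PDMA, and the flag value is
made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value for this mock; not the kernel's definition. */
#define MOCK_FOLL_PCI_P2PDMA 0x1u

/* Hypothetical stand-in for struct page, reduced to what the sketch needs. */
struct mock_page {
	int refcount;    /* 0 means the page is free and must not be used */
	bool is_p2pdma;  /* stands in for is_pci_p2pdma_page(page) */
};

/*
 * Mock of the proposed ordering: obtain the reference first, and only
 * then look at the page contents. If the caller did not ask for P2PDMA
 * pages and this is one, give the reference back and fail.
 */
static struct mock_page *mock_try_grab(struct mock_page *page,
				       unsigned int flags)
{
	/* Refuse to take a reference on a page that is already free. */
	if (page->refcount <= 0)
		return NULL;
	page->refcount++;

	/* The page is now pinned, so inspecting it is safe. */
	if (!(flags & MOCK_FOLL_PCI_P2PDMA) && page->is_p2pdma) {
		page->refcount--;  /* undo the grab before bailing out */
		return NULL;
	}

	return page;
}
```

Because the check sits inside the grab helper, every caller that goes
through it gets the same behaviour without repeating the test at each
call site, which is the point made in the review comment.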