On Wed, Aug 14, 2024 at 10:19:54AM -0300, Jason Gunthorpe wrote: > On Fri, Aug 09, 2024 at 12:08:59PM -0400, Peter Xu wrote: > > > +/** > > + * follow_pfnmap_start() - Look up a pfn mapping at a user virtual address > > + * @args: Pointer to struct @follow_pfnmap_args > > + * > > + * The caller needs to setup args->vma and args->address to point to the > > + * virtual address as the target of such lookup. On a successful return, > > + * the results will be put into other output fields. > > + * > > + * After the caller finished using the fields, the caller must invoke > > + * another follow_pfnmap_end() to proper releases the locks and resources > > + * of such look up request. > > + * > > + * During the start() and end() calls, the results in @args will be valid > > + * as proper locks will be held. After the end() is called, all the fields > > + * in @follow_pfnmap_args will be invalid to be further accessed. > > + * > > + * If the PTE maps a refcounted page, callers are responsible to protect > > + * against invalidation with MMU notifiers; otherwise access to the PFN at > > + * a later point in time can trigger use-after-free. > > + * > > + * Only IO mappings and raw PFN mappings are allowed. > > What does this mean? The paragraph before said this can return a > refcounted page? This came from the old follow_pte(), I kept that as I suppose we should allow VM_IO | VM_PFNMAP just like before, even if in this case I suppose only the pfnmap matters where huge mappings can start to appear. > > > + * The mmap semaphore > > + * should be taken for read, and the mmap semaphore cannot be released > > + * before the end() is invoked. > > This function is not safe for IO mappings and PFNs either, VFIO has a > known security issue to call it. That should be emphasised in the > comment. Any elaboration on this? I could have missed that.. > > The caller must be protected by mmu notifiers or other locking that > guarentees the PTE cannot be removed while the caller is using it. In > all cases. > > Since this hold the PTL until end is it always safe to use the > returned address before calling end? I suppose so? As the pgtable is stable, I thought it means it's safe, but I'm not sure now when you mentioned there's a VFIO known issue, so I could have overlooked something. There's no address returned, but pfn, pgprot, write, etc. The user needs to do proper mapping if they need an usable address, e.g. generic_access_phys() does ioremap_prot() and recheck the pfn didn't change. Thanks, -- Peter Xu