On Fri, Aug 16, 2024 at 10:21:17AM -0400, Peter Xu wrote:
> On Fri, Aug 16, 2024 at 11:30:31AM +0200, David Hildenbrand wrote:
> > On 14.08.24 15:05, Jason Gunthorpe wrote:
> > > On Fri, Aug 09, 2024 at 07:25:36PM +0200, David Hildenbrand wrote:
> > >
> > > > > > That is in general not what we want, and we still have some places that
> > > > > > wrongly hard-code that behavior.
> > > > > >
> > > > > > In a MAP_PRIVATE mapping you might have anon pages that we can happily walk.
> > > > > >
> > > > > > vm_normal_page() / vm_normal_page_pmd() [and as commented as a TODO,
> > > > > > vm_normal_page_pud()] should be able to identify PFN maps and reject them,
> > > > > > no?
> > > > >
> > > > > Yep, I think we can also rely on special bit.
> > >
> > > It is more than just relying on the special bit..
> > >
> > > VM_PFNMAP/VM_MIXEDMAP should really only be used inside
> > > vm_normal_page() because they are, effectively, support for a limited
> > > emulation of the special bit on arches that don't have it. There are
> > > a bunch of weird rules that are used to try and make that work
> > > properly that have to be followed.
> > >
> > > On arches with the special bit they should possibly never be checked
> > > since the special bit does everything you need.
> > >
> > > Arguably any place reading those flags outside of vm_normal_page/etc
> > > is suspect.
> >
> > IIUC, your opinion matches mine: VM_PFNMAP/VM_MIXEDMAP and pte_special()/...
> > usage should be limited to vm_normal_page/vm_normal_page_pmd/ ... of course,
> > GUP-fast is special (one of the reasons for "pte_special()" and friends after
> > all).
>
> The issue is at least GUP currently doesn't work with pfnmaps, while
> there're potentially users who want to be able to work on both page +
> !page use cases.  Besides access_process_vm(), KVM also uses a similar thing,
> and maybe more; these all seem to be valid use cases for referencing the vma
> flags for PFNMAP and such, so they can identify "it's pfnmap" or more
> generic issues like "permission check error on pgtable".

Why are those valid compared with calling vm_normal_page() per-page
instead? What reason is there to not do something based only on the
PFNMAP flag?

Jason