On Tue, Aug 24, 2021 at 11:44:20AM -0700, Dan Williams wrote:

> Yes, that's along the lines of what I'm thinking. I.e don't expect
> pte_devmap() to be there in the slow path, and use the vma to check
> for DAX.

I think we should delete pte_devmap completely from gup.c.

It is doing a few things that are better done in more general ways:

1) Doing the get_dev_pagemap() stuff which should be entirely deleted
   from gup.c in favour of proper use of struct page references.

2) Denying FOLL_LONGTERM

   Once GUP has grabbed the page we can call is_zone_device_page() on
   the struct page. If true we can check page->pgmap and read some
   DENY_FOLL_LONGTERM flag from there.

3) Different refcounts for pud/pmd pages

   Ideally DAX cases would not do this (ie Joao is fixing device-dax)
   but in the interim we can just loop over the PUD/PMD in all
   cases. Looping is safe for THP AFAIK. I described how this can work
   here:

   https://lore.kernel.org/all/20211013174140.GJ2744544@xxxxxxxxxx/

After that there are only two remaining uses:

4) The pud/pmd_devmap() in vm_normal_page() should just go away.
   ZONE_DEVICE memory with struct pages SHOULD be a normal page. This
   also means dropping pte_special too.

5) dev_pagemap_mapping_shift() - I don't know what this does but why
   not use the is_zone_device_page() approach from 2?

In this way ZONE_DEVICE pages can be fully normal pages with no
requirements on PTE flags.

Where have I gone wrong? :)

pud/pmd_devmap() looks a little more involved to remove, but I wonder
if we can change logic like this:

	if (pmd_trans_huge(*vmf->pmd) || pmd_devmap(*vmf->pmd)) {

Into

	if (pmd_is_page(*pmd))

? And rely on struct page based stuff as above to discern THP vs
devmap?

Thanks,
Jason
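
To make the idea in 2) concrete, here is a rough userspace model of the
check, not kernel code. Everything prefixed model_/MODEL_ is a
hypothetical stand-in: the kernel has no DENY_FOLL_LONGTERM pgmap flag
today, and the structs below only mimic the shape of struct page and
struct dev_pagemap. It only shows the decision being made from the
struct page after GUP has grabbed it, with no PTE flag involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the FOLL_LONGTERM gup flag. */
#define MODEL_FOLL_LONGTERM 0x1u

/* Hypothetical model of struct dev_pagemap carrying the proposed flag. */
struct model_pagemap {
	bool deny_foll_longterm;
};

/* Hypothetical model of struct page: only the fields the check needs. */
struct model_page {
	bool zone_device;            /* what is_zone_device_page() would report */
	struct model_pagemap *pgmap; /* models page->pgmap */
};

static bool model_is_zone_device_page(const struct model_page *page)
{
	return page->zone_device;
}

/*
 * Decide whether a page that GUP already holds may be pinned long term.
 * Returns 0 if the pin may proceed, -1 if it must be refused.
 */
static int model_check_longterm(const struct model_page *page,
				unsigned int gup_flags)
{
	assert(page != NULL);

	/* Short-term pins are never restricted by this check. */
	if (!(gup_flags & MODEL_FOLL_LONGTERM))
		return 0;

	/* Ordinary pages carry no pgmap, so there is nothing to consult. */
	if (!model_is_zone_device_page(page))
		return 0;

	/* ZONE_DEVICE page: the owning pgmap decides. */
	if (page->pgmap && page->pgmap->deny_foll_longterm)
		return -1;

	return 0;
}
```

In a real implementation the refusal would also have to drop the
reference GUP took on the page; the model leaves that out.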