On Tue, Dec 08, 2020 at 05:29:01PM +0000, Joao Martins wrote:
> Similar to follow_hugetlb_page(), add a follow_devmap_page() which,
> rather than calling follow_page() per 4K page in a PMD/PUD, does so
> for the entire PMD: we lock the pmd/pud, get all pages, then unlock.
>
> While doing so, we only change the refcount once when PGMAP_COMPOUND
> is passed in.
>
> This lets us improve {pin,get}_user_pages{,_longterm}() considerably:
>
> $ gup_benchmark -f /dev/dax0.2 -m 16384 -r 10 -S [-U,-b,-L] -n 512 -w
>
> (<test>)                           [before] -> [after]
> (get_user_pages 2M pages)          ~150k us -> ~8.9k us
> (pin_user_pages 2M pages)          ~192k us -> ~9k us
> (pin_user_pages_longterm 2M pages) ~200k us -> ~19k us
>
> Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx>
> ---
> I've special-cased this to device-dax vmas given their similar
> page-size guarantees to hugetlbfs, but I feel this is a bit wrong. I
> am replicating follow_hugetlb_page(), and as an RFC this ought to
> seek feedback on whether it should be generalized, if no fundamental
> issues exist. In such a case, should I be changing follow_page_mask()
> to take either an array of pages, or a function pointer and opaque
> arguments that would let the caller pick its own structure?

I would be extremely sad if this was the only way to do this :(

We should be trying to make things more general. The hmm_range_fault()
path doesn't have major special cases for device pages; I am struggling
to understand why gup fast and slow do.

What we've talked about is changing the calling convention across all
of this to something like:

  struct gup_output {
	struct page **cur;
	struct page **end;
	unsigned long vaddr;
	[..]
  };

And making the manipulators, like you saw for GUP, common:

  gup_output_single_page()
  gup_output_pages()

Then putting this everywhere. This is the pattern that we ended up with
in hmm_range_fault(), and it seems to be working quite well.

fast/slow should be much more symmetric in code than they are today,
IMHO..
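To make the suggested convention concrete, here is a rough userspace
sketch of what the gup_output cursor and its two manipulators could look
like. This is purely illustrative, not in-tree code: struct page is
stubbed out, the 4096-byte step and refcount handling are invented for
the example, and only the cur/end/vaddr fields and the two helper names
come from the mail above.

```c
#include <assert.h>

/* Stand-in for the kernel's struct page; purely for illustration. */
struct page {
	int refcount;
};

/*
 * Cursor over the caller-supplied output array, as sketched in the
 * mail: 'cur' advances toward 'end' as pages are emitted, and 'vaddr'
 * tracks the user virtual address being filled.
 */
struct gup_output {
	struct page **cur;
	struct page **end;
	unsigned long vaddr;
};

/* Emit a single page and advance the cursor by one 4K step. */
static int gup_output_single_page(struct gup_output *out, struct page *page)
{
	if (out->cur == out->end)
		return -1;		/* output array is full */
	page->refcount++;		/* one ref per emitted entry */
	*out->cur++ = page;
	out->vaddr += 4096;
	return 0;
}

/*
 * Emit 'npages' consecutive subpages of a compound head while taking
 * only ONE reference on the head -- mirroring the PGMAP_COMPOUND idea
 * of not touching the refcount once per 4K page.
 */
static int gup_output_pages(struct gup_output *out, struct page *head,
			    unsigned long npages)
{
	unsigned long i;

	if ((unsigned long)(out->end - out->cur) < npages)
		return -1;		/* not enough room left */
	head->refcount++;		/* single ref for the whole batch */
	for (i = 0; i < npages; i++) {
		*out->cur++ = &head[i];
		out->vaddr += 4096;
	}
	return 0;
}
```

The point of the shape is that fast, slow, and any huge-page path all
just call the same two helpers against one cursor, instead of each
open-coding its own page-array bookkeeping.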
I think those differences mainly exist because this code used to be
siloed in arch code. Some of the differences might be bugs; we've seen
that a few times at least..

Jason