On Thu, 21 Mar 2024 18:08:02 -0400 peterx@xxxxxxxxxx wrote:

> From: Peter Xu <peterx@xxxxxxxxxx>
>
> Now follow_page() is ready to handle hugetlb pages in whatever form, and
> on all architectures.  Switch to the generic code path.
>
> Time to retire hugetlb_follow_page_mask(), following the previous
> retirement of follow_hugetlb_page() in 4849807114b8.
>
> There may be a slight difference in how the loops run when processing
> slow GUP over a large hugetlb range on cont_pte/cont_pmd supported archs:
> with the patch applied, each loop of __get_user_pages() will resolve one
> pgtable entry, rather than relying on the size of the hugetlb hstate,
> which may cover multiple entries in one loop.
>
> A quick performance test on an aarch64 VM on an M1 chip shows a 15%
> degradation over a tight loop of slow gup after the path is switched.
> That shouldn't be a problem, because slow gup should not be a hot path
> for GUP in general: when the page is present (the common case), fast gup
> will already succeed, while when the page is indeed missing and requires
> a follow-up page fault, the slow-gup degradation will probably be buried
> in the fault paths anyway.  This also explains why slow gup for THP used
> to be very slow before 57edfcfd3419 ("mm/gup: accelerate thp gup even for
> "pages != NULL"") landed; that is not part of this performance analysis,
> just a side note.  If performance becomes a concern, we can consider
> handling CONT_PTE in follow_page().
>
> Before that is justified to be necessary, keep everything clean and
> simple.

mm/gup.c:33:21: warning: 'follow_hugepd' declared 'static' but never defined [-Wunused-function]
   33 | static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
      |                     ^~~~~~~~~~~~~

--- a/mm/gup.c~mm-gup-handle-hugepd-for-follow_page-fix
+++ a/mm/gup.c
@@ -30,10 +30,12 @@ struct follow_page_context {
 	unsigned int page_mask;
 };
 
+#ifdef CONFIG_HAVE_FAST_GUP
 static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
 				  unsigned long addr, unsigned int pdshift,
 				  unsigned int flags,
 				  struct follow_page_context *ctx);
+#endif
 
 static inline void sanity_check_pinned_pages(struct page **pages,
 					     unsigned long npages)
_

This looks inelegant.

That's two build issues so far.  Please be more expansive in the Kconfig
variations when testing, especially when mucking with pgtable macros.
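
For illustration, a possibly tidier alternative would be to keep the
forward declaration unconditional and instead give the configurations that
compile the real definition out a trivial stub next to it, so nothing is
ever declared 'static' without a definition.  This is only a sketch based
on the hunk above, not the fix that was applied: the guard symbol
(CONFIG_HAVE_FAST_GUP) and the "no page" stub semantics are assumptions.

#ifdef CONFIG_HAVE_FAST_GUP
/* ... real follow_hugepd() definition, unchanged ... */
#else
static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd,
				  unsigned long addr, unsigned int pdshift,
				  unsigned int flags,
				  struct follow_page_context *ctx)
{
	/*
	 * Assumed unreachable on these configs: hugepd entries should not
	 * exist here, so report "no page" rather than crashing.
	 */
	WARN_ON_ONCE(1);
	return NULL;
}
#endif

With a stub like this the declaration at the top of mm/gup.c needs no
#ifdef at all, at the cost of a few extra lines near the definition.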