Hey,

While looking at ZONE_DEVICE struct page reuse, particularly the last
patch[0], I found two possible improvements for follow_hugetlb_page(),
which is solely used for get_user_pages()/pin_user_pages().

The first patch batches page refcount updates, while the second tidies up
how the subpages/vmas are stored. Together they bring the cost of the slow
variant of gup() from ~86k usecs down to ~4.4k usecs.

libhugetlbfs tests seem to pass, as well as the gup_test benchmarks with
hugetlbfs vmas.

[0] https://lore.kernel.org/linux-mm/20201208172901.17384-11-joao.m.martins@xxxxxxxxxx/

Joao Martins (2):
  mm/hugetlb: grab head page refcount once per group of subpages
  mm/hugetlb: refactor subpage recording

 include/linux/mm.h |  3 +++
 mm/gup.c           |  5 ++--
 mm/hugetlb.c       | 66 +++++++++++++++++++++++++++-------------------
 3 files changed, 44 insertions(+), 30 deletions(-)

-- 
2.17.1