On Fri, Feb 05, 2021 at 08:41:25PM +0000, Joao Martins wrote:
> Rather than decrementing the head page refcount one by one, we
> walk the page array and check which pages belong to the same
> compound_head. Later on we decrement the calculated amount
> of references in a single write to the head page. To that
> end, switch to for_each_compound_head(), which does most of
> the work.
>
> set_page_dirty() needs no adjustment as it's a nop for
> non-dirty head pages and it doesn't operate on tail pages.
>
> This considerably improves unpinning of pages with THP and
> hugetlbfs:
>
> - THP
>   gup_test -t -m 16384 -r 10 [-L|-a] -S -n 512 -w
>   PIN_LONGTERM_BENCHMARK (put values): ~87.6k us -> ~23.2k us
>
> - 16G with 1G huge page size
>   gup_test -f /mnt/huge/file -m 16384 -r 10 [-L|-a] -S -n 512 -w
>   PIN_LONGTERM_BENCHMARK (put values): ~87.6k us -> ~27.5k us
>
> Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx>
> Reviewed-by: John Hubbard <jhubbard@xxxxxxxxxx>
> ---
>  mm/gup.c | 29 +++++++++++------------------
>  1 file changed, 11 insertions(+), 18 deletions(-)

Looks fine.

Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>

I was wondering why this only touches the FOLL_PIN path; it would
make sense to also use this same logic for release_pages():

	for (i = 0; i < nr; i++) {
		struct page *page = pages[i];

		page = compound_head(page);
		if (is_huge_zero_page(page))
			continue;

Jason
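
For reference, a minimal sketch of the compound-head batching that both
the FOLL_PIN path and release_pages() could share. The helper names
below (put_compound_refs(), put_pages_batched()) are hypothetical
illustrations, not the patch's actual hunks; compound_head(),
page_ref_sub() and put_page() are the existing kernel primitives. The
sketch also omits the special cases, such as the huge zero page, that
release_pages() has to handle:

	/*
	 * Hypothetical helper: drop @refs references on a compound head
	 * with a single refcount update, leaving the final reference to
	 * put_page() so freeing still goes through the normal path.
	 */
	static void put_compound_refs(struct page *head, unsigned int refs)
	{
		if (refs > 1)
			page_ref_sub(head, refs - 1);
		put_page(head);
	}

	/*
	 * Hypothetical: release a page array one compound-head run at
	 * a time instead of one page at a time.
	 */
	static void put_pages_batched(struct page **pages, unsigned long npages)
	{
		unsigned long i = 0, j;

		while (i < npages) {
			struct page *head = compound_head(pages[i]);

			/* Extend the run while entries share the same head. */
			for (j = i + 1; j < npages; j++)
				if (compound_head(pages[j]) != head)
					break;

			/* One refcount write covers the whole run. */
			put_compound_refs(head, j - i);
			i = j;
		}
	}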