On Wed, Dec 09, 2020 at 06:14:06PM +0000, Matthew Wilcox wrote:
> > Still, I think it would be easier to teach record_subpages() that a
> > PMD doesn't necessarily point to a high order page, eg do something
> > like I suggested for the SGL where it extracts the page order and
> > iterates over the contiguous range of pfns.
>
> But we also see good performance improvements from doing all reference
> counts on the head page instead of spread throughout the pages, so we
> really want compound pages.

Oh no doubt! I'm not saying not to do that, just wanting to see some
consolidation of the page table reading code.

Instead of obtaining and checking the pgmap for PGMAP_COMPOUND (which is
unique to devmap and very expensive), do the same algorithm we are
talking about for unpin.

Given a starting pfn and # of pages following (eg a PMD can be described
like this), compute the minimum list of (compound_head, ntails) tuples
that spans that physical range.

For instance, using your folio language, all the gup fast stuff pretty
much boils down to:

	start_page = pmd_page(*pmd);
	// Select the sub PMD range GUP is interested in
	npages = adjust_for_vaddr(&start_page, vaddr, vlength, PMD_SHIFT);
	for_each_folio(start_page, npages, &folio, &ntails) {
		try_grab_folio(folio, ntails);
	}
	record_pages_in_output(start_page, npages);

No need for all the gup_device* stuff at all. If 'for_each_folio' starts
returning high order pages for devmap because the first part of this
series made compound_order higher, then great!

It also consolidates with the trailing part of gup_hugepte() and more on
the gup slow side too.

for_each_folio is just some simple maths that does:

	folio = to_folio(page)
	head = compound_head(page)	// ie the first page of the folio
	ntails = min((1 << folio_order(folio)) - (page - head), num_pages)
	num_pages -= ntails
	page += ntails

Jason
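
For what it's worth, a minimal stand-alone sketch of that splitting
arithmetic, runnable in userspace, might look like the below. It is only a
model: FOLIO_ORDER, folio_head_pfn() and the hard-coded pfn/num_pages
values are invented for illustration, standing in for however the real
code would look up a folio's order and for the try_grab_folio() call.

	#include <stdio.h>

	/* Pretend every folio is 2^FOLIO_ORDER pages and naturally aligned. */
	#define FOLIO_ORDER 4

	static unsigned long folio_head_pfn(unsigned long pfn)
	{
		return pfn & ~((1UL << FOLIO_ORDER) - 1);
	}

	int main(void)
	{
		unsigned long pfn = 19;        /* start mid-folio on purpose */
		unsigned long num_pages = 40;  /* eg a sub-range of a PMD */

		while (num_pages) {
			unsigned long head = folio_head_pfn(pfn);
			unsigned long left = (1UL << FOLIO_ORDER) - (pfn - head);
			unsigned long ntails = left < num_pages ? left : num_pages;

			/* this is where try_grab_folio(folio, ntails) would go */
			printf("folio at pfn %lu: %lu pages starting at pfn %lu\n",
			       head, ntails, pfn);

			pfn += ntails;
			num_pages -= ntails;
		}
		return 0;
	}

Each loop iteration emits one (head, ntails) tuple, so a range that starts
or ends in the middle of a folio still gets the single per-folio reference
count the pseudocode above relies on.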