On 04/12/2023 12:43, David Hildenbrand wrote:
> On 04.12.23 13:39, Ryan Roberts wrote:
>> On 04/12/2023 12:28, David Hildenbrand wrote:
>>> On 04.12.23 13:26, Ryan Roberts wrote:
>>>>>>>
>>>>>>> Also, struct page (memmap) might not be always contiguous, using struct page
>>>>>>> pointers to represent folio range might not give the result you want.
>>>>>>> See nth_page() and folio_page_idx() in include/linux/mm.h.
>>>>>>
>>>>>> Is that true for pages within the same folio too? Or are all pages in a folio
>>>>>> guaranteed contiguous? Perhaps I'm better off using pfn?
>>>>>
>>>>> folio_page_idx() says not all pages in a folio are guaranteed to be contiguous.
>>>>> PFN might be a better choice.
>>>>
>>>> Hi Zi, Matthew,
>>>>
>>>> Zi made this comment a couple of months back that it is incorrect to
>>>> assume that `struct page`s within a folio are (virtually) contiguous.
>>>> I'm not sure if that's really the case though? I see other sites in the
>>>> source that do page++ when iterating over a folio, e.g. smaps_account(),
>>>> splice_folio_into_pipe(), __collapse_huge_page_copy(), etc.
>>>>
>>>> Any chance someone could explain the rules?
>>>
>>> With the vmemmap, they are contiguous. Without a vmemmap, but with sparsemem, we
>>> might end up allocating one memmap chunk per memory section (e.g., 128 MiB).
>>>
>>> So, for example, a 1 GiB hugetlb page could cross multiple 128 MiB sections, and
>>> therefore, the memmap might not be virtually consecutive.
>>
>> OK, is a "memory section" always 128M or is it variable? If fixed, does that
>> mean that it's impossible for a THP to cross section boundaries? (because a THP
>> is always smaller than a section?)
>
> Section size is variable (see SECTION_SIZE_BITS), but IIRC, buddy allocations
> will never cross them.
>
>>
>> Trying to figure out why my original usage in this series was wrong, but
>> presumably the other places that I mentioned are safe.
>
> If only dealing with buddy allocations, *currently* it might always fall into a
> single memory section.

OK that makes sense - thanks!

>
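
To make the rule discussed above concrete, here is a minimal userspace sketch
(not kernel code) of a sparsemem layout without a vmemmap, where each memory
section gets its own memmap chunk. Everything prefixed with toy_/TOY_ is
hypothetical and exists only for this illustration: the sizes are tiny, the
padding slot merely stands in for "separately allocated chunks need not be
virtually adjacent", and the pfn is stored in the struct purely to keep the
model short. As far as I understand include/linux/mm.h, nth_page() in that
configuration goes through the pfn in a similar way, while with a vmemmap it
is plain pointer arithmetic.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy values for illustration only; real section sizes are set per
 * architecture via SECTION_SIZE_BITS. */
#define TOY_PAGES_PER_SECTION 4UL
#define TOY_NR_SECTIONS       3UL
#define TOY_GAP               1UL	/* padding that keeps chunks non-adjacent */

struct toy_page {
	unsigned long pfn;	/* kept in the struct only to simplify the model */
};

/*
 * One memmap chunk per section. Only the first TOY_PAGES_PER_SECTION
 * entries of each row are "real"; the padding entry models the fact that
 * separately allocated chunks need not be virtually adjacent.
 */
static struct toy_page toy_memmap[TOY_NR_SECTIONS]
				 [TOY_PAGES_PER_SECTION + TOY_GAP];

/* Fill in the pfn of every real entry. Must be called before any lookup. */
static void toy_memmap_init(void)
{
	for (unsigned long s = 0; s < TOY_NR_SECTIONS; s++)
		for (unsigned long i = 0; i < TOY_PAGES_PER_SECTION; i++)
			toy_memmap[s][i].pfn = s * TOY_PAGES_PER_SECTION + i;
}

/* pfn -> page lookup that goes through the per-section chunks. */
static struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
	assert(pfn < TOY_NR_SECTIONS * TOY_PAGES_PER_SECTION);
	return &toy_memmap[pfn / TOY_PAGES_PER_SECTION]
			  [pfn % TOY_PAGES_PER_SECTION];
}

/* Step n pages forward via the pfn rather than via pointer arithmetic. */
static struct toy_page *toy_nth_page(struct toy_page *page, unsigned long n)
{
	return toy_pfn_to_page(page->pfn + n);
}

/* True if page++ would visit the same entries as a pfn-based walk. */
static bool toy_range_memmap_contiguous(unsigned long start_pfn,
					unsigned long nr_pages)
{
	for (unsigned long i = 0; i + 1 < nr_pages; i++)
		if (toy_pfn_to_page(start_pfn + i) + 1 !=
		    toy_pfn_to_page(start_pfn + i + 1))
			return false;
	return true;
}

/* Number of sections touched by [start_pfn, start_pfn + nr_pages). */
static unsigned long toy_sections_spanned(unsigned long start_pfn,
					  unsigned long nr_pages,
					  unsigned long pages_per_section)
{
	return (start_pfn + nr_pages - 1) / pages_per_section -
	       start_pfn / pages_per_section + 1;
}
```

In this model, after toy_memmap_init(), a 4-page range starting at pfn 0 stays
inside one section and is safe to walk with page++, whereas a 4-page range
starting at pfn 2 crosses a section boundary and is not, although
toy_nth_page() still reaches pfn 5 correctly. Assuming 4 KiB base pages,
toy_sections_spanned(0, 262144, 32768) gives 8, which matches the 1 GiB
hugetlb page spanning eight 128 MiB sections in the example above.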