On 04.07.24 06:30, Oscar Salvador wrote:
Hi all,
During Peter's talk at the LSFMM, it was agreed that one of the things
that need to be done in order to further integrate hugetlb into mm core,
is to unify generic and hugetlb pagewalkers.
I started with this one, which is unifying hugetlb into generic
pagewalk, instead of having its hugetlb_entry entries.
Which means that pmd_entry/pte_entry (for cont-pte) entries will also deal with
hugetlb vmas, and so will new pud_entry entries since hugetlb can be
pud mapped (devm pages as well but we seem not to care about those with
the exception of hmm code).
The outcome is this RFC.
First of all, a good step in the right direction, but maybe not what
we want long-term. So I'm questioning whether we want this intermediate
approach. walk_page_range() and friends are simply not a good design
(e.g., indirect function calls).
There are roughly two categories of page table walkers we have:
1) We actually only want to walk present folios (to be precise, page
ranges of folios). We should look into moving away from the
page walker API where possible, and have something better that
directly gives us the folio (page ranges). Any PTE batching would be
done internally.
2) We want to deal with non-present folios as well (swp entries and all
kinds of other stuff). We should maybe implement our custom page
table walker and move away from walk_page_range(). We are not walking
"pages" after all but everything else included :)
Then, there is a subset of 1) where we only want to walk to a single
address (a single folio). I'm working on that right now to get rid of
follow_page() and some (IIRC 3: KSM and DAMON) walk_page_range() users.
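To make the contrast concrete, here is a rough userspace sketch, not kernel code. It models a flat "page table" and shows a callback-driven walk (one indirect call per entry) next to a walker that directly hands back present folio ranges and batches internally, plus the single-address case. Every toy_* name and type is invented for illustration, and a real interface would likely look quite different.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, not kernel code: every toy_* name is made up. */
enum toy_kind { TOY_NONE, TOY_PRESENT, TOY_SWAP };

struct toy_entry {
	enum toy_kind kind;
	int folio_id;		/* meaningful only for TOY_PRESENT, assumed >= 0 */
};

/* Style A: callback-driven walk, one indirect call per entry. */
struct toy_walk_ops {
	int (*entry)(const struct toy_entry *e, size_t idx, void *priv);
};

static int toy_walk_range(const struct toy_entry *tbl, size_t start, size_t end,
			  const struct toy_walk_ops *ops, void *priv)
{
	assert(start <= end);
	for (size_t i = start; i < end; i++) {
		int ret = ops->entry(&tbl[i], i, priv);

		if (ret)
			return ret;	/* non-zero aborts the walk */
	}
	return 0;
}

/* Style B: the walker returns whole folio ranges; batching is internal. */
struct toy_folio_range {
	int folio_id;
	size_t first;		/* index of the first entry of the range */
	size_t nr;		/* number of consecutive entries */
};

/*
 * Find the next present folio range at or after *pos. Returns 1 and fills
 * *out on success, 0 once the end of the table is reached.
 */
static int toy_next_folio_range(const struct toy_entry *tbl, size_t end,
				size_t *pos, struct toy_folio_range *out)
{
	size_t i = *pos;

	while (i < end && tbl[i].kind != TOY_PRESENT)
		i++;		/* skip none and swap entries */
	*pos = i;
	if (i >= end)
		return 0;

	out->folio_id = tbl[i].folio_id;
	out->first = i;
	while (i < end && tbl[i].kind == TOY_PRESENT &&
	       tbl[i].folio_id == out->folio_id)
		i++;
	out->nr = i - out->first;
	*pos = i;
	return 1;
}

/* Single-address case: return the folio at idx, or -1 if none is present. */
static int toy_folio_at(const struct toy_entry *tbl, size_t n, size_t idx)
{
	if (idx >= n || tbl[idx].kind != TOY_PRESENT)
		return -1;
	return tbl[idx].folio_id;
}

/* The same question asked both ways: how many entries are present? */
static int toy_count_cb(const struct toy_entry *e, size_t idx, void *priv)
{
	(void)idx;
	if (e->kind == TOY_PRESENT)
		(*(size_t *)priv)++;
	return 0;
}

static size_t toy_count_present_a(const struct toy_entry *tbl, size_t n)
{
	const struct toy_walk_ops ops = { .entry = toy_count_cb };
	size_t count = 0;

	toy_walk_range(tbl, 0, n, &ops, &count);
	return count;
}

static size_t toy_count_present_b(const struct toy_entry *tbl, size_t n)
{
	struct toy_folio_range r;
	size_t pos = 0, count = 0;

	while (toy_next_folio_range(tbl, n, &pos, &r))
		count += r.nr;
	return count;
}
```

In style B the caller keeps its state in local variables and never sees non-present entries, which is the point of category 1). A category 2) user would instead want a walker that also reports the swap-like entries.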
Hugetlb will still remain a bit special, but I'm afraid we cannot hide
that completely.
--
Cheers,
David / dhildenb