On Wed, Feb 01, 2023 at 07:45:17PM -0800, Mike Kravetz wrote:
> On 01/24/23 18:13, Matthew Wilcox wrote:
> > Once we get to the part of the folio journey where we have
> > one-pointer-per-page, we can't afford to maintain per-page state.
> > Currently we maintain a per-page mapcount, and that will have to go.
> > We can maintain extra state for a multi-page folio, but it has to be a
> > constant amount of extra state no matter how many pages are in the folio.
> >
> > My proposal is that we maintain a single mapcount per folio, and its
> > definition is the number of (vma, page table) tuples which have a
> > reference to any pages in this folio.
>
> Hi Matthew, finally took a look at this.  Can you clarify your definition of
> 'page table' here?  I think you are talking about all the entries within
> one page table page?  Is that correct?  It certainly makes sense in this
> context.
>
> I have always thought of page table as the entire tree structure starting at
> *pgd in the mm_struct.  So, I was a bit confused.  But, I now see elsewhere
> that 'page table' may refer to either.

Yes, we're pretty sloppy about that.  What I had in mind was:

We have a large folio which is mapped at, say, (1.9MB - 2.1MB) in the
user address space.  There are thus multiple PTEs which map it, and some
of those PTEs belong to one PMD table while the rest belong to a second
PMD table.  It has a mapcount of 2 due to being mapped by PTE entries
belonging to two PMD tables.  If it were mapped at (2.1MB - 2.3MB), it
would have a mapcount of 1 due to all its PTEs belonging to a single
PMD table.

[ Mike & I spoke earlier this week about what should happen with the
mapcount of a theoretical aligned 1GB THP that has its PUD mapping
split into PTE mappings.  Splitting a PMD to PTEs does not affect the
mapcount since all of the PTEs are referenced from a single PMD table
instead of from a single PMD entry.  But splitting a PUD to PTEs should
increment the mapcount by 511 since the folio is now referenced from
512 PMD tables instead of one PUD entry. ]