Re: [LSF/MM/BPF TOPIC] MM: Mapcount Madness

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Mon, 29 Jan 2024 13:49:30 +0000

On Mon, Jan 29, 2024 at 01:05:04PM +0100, David Hildenbrand wrote:
> As PTE-mapped large folios become more relevant (mTHP [1]) and there is the
> desire to shrink the metadata allocated for such large folios as well
> (memdesc [2]), how we track folio mappings gets more relevant. Over the
> years, we used folio mapping information to answer various questions: is
> this folio mapped by somebody else? do we have to COW on write fault? how do
> we adjust memory statistics? ...
> 
> Let's talk about ongoing work in the mapcount area, get a common
> understanding of what the users of the different mapcounts are and what the
> implications of removing some would be: which questions could we answer
> differently, which questions would we not be able to answer precisely
> anymore, and what would be the implications of such changes?
> 
> For example, can we tolerate some imprecise memory statistics? How
> expressive is the PSS when large folios are only partially mapped? Would we
> need a transition period and glue changes to a new CONFIG_ option? Do we
> really have to support THP and friends on 32bit?

Excellent topics to cover.  I have some of my own questions ...

Are we in danger of overflowing page refcount too easily?  Pincount
isn't an issue here; we're talking about large folios, so pincount gets
its own field.  But with tracking one mapcount per PTE mapping of a
folio, we can easily increment a PMD-sized folio's refcount by 512
per VMA.  Since 512 is 2^9, we now only need 2^22 VMAs to hit the 2^31
limit before the page->refcount protections go into effect and
operations start failing.
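
To spell out that arithmetic, here is a back-of-the-envelope sketch.  It
assumes 4KiB base pages and a 2MiB PMD-sized folio (hence 512 PTEs per
folio), and that every PTE mapping holds one reference.  vmas_to_reach()
is a made-up helper for illustration, not a kernel function.

```c
#include <stdint.h>

/* Assumption: 2MiB PMD-sized folio / 4KiB base pages = 512 PTEs. */
#define PTES_PER_PMD_FOLIO 512u

/*
 * Hypothetical helper: number of VMAs, each PTE-mapping the whole folio,
 * needed for the refcount to reach 2^limit_log2 when every VMA adds
 * refs_per_vma references.
 */
static uint64_t vmas_to_reach(unsigned int limit_log2, uint64_t refs_per_vma)
{
	return (UINT64_C(1) << limit_log2) / refs_per_vma;
}
```

With limit_log2 = 31 and refs_per_vma = PTES_PER_PMD_FOLIO this gives
2^22, i.e. about four million VMAs.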

Do we need to track mapcount (and if so, how) for pages mapped to
userspace which are neither file-backed nor anonymous mappings?  e.g.
drivers pass vmalloc memory to vmf_insert_page() in their ->mmap handler.

What do VM_PFNMAP and VM_MIXEDMAP really imply?  The documentation here
is a little sparse.  And that's sad, because I think we expect device
driver writers to use them, and without clear documentation of what
they actually do, they're going to be misused.