Re: Folio mapcount

On Wed, Feb 08, 2023 at 02:36:41PM -0500, Zi Yan wrote:
> On 7 Feb 2023, at 11:51, Matthew Wilcox wrote:
> 
> > On Tue, Feb 07, 2023 at 11:23:31AM -0500, Zi Yan wrote:
> >> On 24 Jan 2023, at 13:13, Matthew Wilcox wrote:
> >>
> >>> Once we get to the part of the folio journey where we have
> >>> one-pointer-per-page, we can't afford to maintain per-page state.
> >>> Currently we maintain a per-page mapcount, and that will have to go.
> >>> We can maintain extra state for a multi-page folio, but it has to be a
> >>> constant amount of extra state no matter how many pages are in the folio.
> >>>
> >>> My proposal is that we maintain a single mapcount per folio, and its
> >>> definition is the number of (vma, page table) tuples which have a
> >>> reference to any pages in this folio.
> >>
> >> How about having two, full_folio_mapcount and partial_folio_mapcount?
> >> If partial_folio_mapcount is 0, we can have a fast path without doing
> >> anything at page level.
> >
> > A fast path for what?  I don't understand your vision; can you spell it
> > out for me?  My current proposal is here:
> 
> A fast code path for handling folios only as a whole.  For cases
> where subpages of a folio are mapped, traversing the subpages might
> be needed, and that will be slow.  Separating the two cases might be
> cleaner and would make whole-folio handling quicker.

To be clear, in this proposal there is no subpage mapcount.  I've got
my eye on one struct folio per allocation, so there will be no more
tail pages.  The proposal has one mapcount, and that's it.  I'd be
open to saying "OK, we need two mapcounts", but not to anything that
has to scale with the number of pages in the folio.
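
To sketch the shape of it (kernel-style illustration only; this is not
the real struct folio layout, and the field name is made up):

	struct folio_sketch {
		atomic_t mapcount;	/* number of (vma, page table)
					 * tuples mapping any page in
					 * this folio */
		/* ... rest of the folio metadata, all constant-sized,
		 * no matter how many pages the folio spans ... */
	};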

> For your proposal, "how many VMAs have one or more pages of this
> folio mapped" should be the responsibility of rmap.  We could add a
> counter to rmap instead.  It seems that you are mixing page table
> mapping with virtual address space (VMA) mapping.

rmap tells you how many VMAs cover this folio.  It doesn't tell you
how many of those VMAs have actually got any pages from it mapped.
It's also rather slower than a simple atomic_read(), so I think
you'll have an uphill battle trying to convince people to use rmap
for this purpose.
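
For comparison, the fast check I have in mind is nothing more than
this (a sketch, using the made-up field name from above and the
0-based count described below):

	static inline bool folio_is_mapped_sketch(const struct folio_sketch *folio)
	{
		return atomic_read(&folio->mapcount) > 0;
	}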

I'm not sure what you mean by "add a counter to rmap".  One count
per mapped page in the VMA?

> >
> > https://lore.kernel.org/linux-mm/Y+FkV4fBxHlp6FTH@xxxxxxxxxxxxxxxxxxxx/
> >
> > The three questions we need to be able to answer (in my current
> > understanding) are laid out here:
> >
> > https://lore.kernel.org/linux-mm/Y+HblAN5bM1uYD2f@xxxxxxxxxxxxxxxxxxxx/
> 
> I think we probably need to clarify the definition of "map" in your
> questions.  Does it mean mapped by page tables or by VMAs?  When a
> page is mapped into a VMA, it can be mapped by one or more page table
> entries, but not the other way around, right?  Or has shared page
> table support been merged now, so that more than one VMA can use a
> single page table entry to map a folio?

Mapped by page tables, just like today.  If "mapped" meant mapped by
VMAs, it would be quite the change: to figure out the mapcount of a
folio newly brought into the page cache, we'd have to do an rmap walk
to see how many mapcounts to give it.  I don't think that's a great
idea.

As far as I know, shared page tables are only supported by hugetlbfs,
and I prefer to stick cheese in my ears and pretend they don't exist.

To be absolutely concrete about this, my proposal is:

1. A folio brought into the page cache has mapcount 0, whether or not
   any VMAs cover it.
2. When we take a page fault on one of its pages, the mapcount goes
   from 0 to 1.
3. When we take another page fault on a page in it, we do a pvmw to
   determine whether any pages from this folio are already mapped by
   this VMA; we see that there is one, so we do not increment the
   mapcount.
4. We partially munmap() so that we need to unmap one of the pages.
   We remove it from the page tables and call page_remove_rmap().
   That does another pvmw, sees that there is still a page in this
   folio mapped by this VMA, and does not decrement the mapcount.
5. We truncate() the file to below the position of the folio, which
   causes us to unmap the rest of it.  The pvmw walk finds no more
   pages from this folio mapped, so we decrement the mapcount.
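
In rough code, the add/remove paths in steps 2-5 might look something
like this (a sketch only: folio_mapped_in_vma() is a hypothetical
stand-in for a page_vma_mapped_walk()-based check, and ->mapcount is
the single per-folio count sketched above, not a field of today's
struct folio):

	/* Called before the new PTE is installed. */
	static void folio_add_rmap_sketch(struct folio *folio,
					  struct vm_area_struct *vma)
	{
		/* Only the first page of this folio mapped by this
		 * VMA bumps the folio-wide mapcount. */
		if (!folio_mapped_in_vma(folio, vma))
			atomic_inc(&folio->mapcount);
	}

	/* Called after the PTE has been removed from the page tables. */
	static void folio_remove_rmap_sketch(struct folio *folio,
					     struct vm_area_struct *vma)
	{
		/* Drop the folio-wide mapcount only when the last page
		 * of this folio mapped by this VMA goes away. */
		if (!folio_mapped_in_vma(folio, vma))
			atomic_dec(&folio->mapcount);
	}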

Clear enough?



