Re: Folio mapcount

On Tue, Jan 24, 2023 at 06:13:21PM +0000, Matthew Wilcox wrote:
> Once we get to the part of the folio journey where we have 
> one-pointer-per-page, we can't afford to maintain per-page state.
> Currently we maintain a per-page mapcount, and that will have to go. 
> We can maintain extra state for a multi-page folio, but it has to be a
> constant amount of extra state no matter how many pages are in the folio.
> 
> My proposal is that we maintain a single mapcount per folio, and its
> definition is the number of (vma, page table) tuples which have a
> reference to any pages in this folio.

I've been thinking about this a lot more, and I have changed my
mind.  It works fine to answer the question "Is any page in this
folio mapped", but it's now hard to answer the question "I have it
mapped, does anybody else?"  That question is asked, for example,
in madvise_cold_or_pageout_pte_range().

With this definition, if the mapcount is 1, it's definitely only mapped
by us.  If it's more than 2, it's definitely mapped by somebody else (*).
If it's 2, maybe we have the folio mapped twice, and maybe we have it
mapped once and somebody else has it mapped once, so we have to consult
the rmap to find out.  Not fun times.

(*) If we support folios larger than PMD size, then the answer is more
complex, since a single VMA can then map the folio through more than one
page table (a folio spanning two PMDs already accounts for two tuples on
its own).

I now think the mapcount has to be defined as "How many VMAs have
one-or-more pages of this folio mapped".
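
With that definition, the madvise-style question above becomes a plain
comparison again: the caller is walking its own page tables, so its VMA
accounts for exactly one of the per-VMA references.  Something like this
(my sketch; it assumes folio_mapcount() is redefined to return the
per-VMA count described above):

static inline bool folio_mapped_only_by_me(struct folio *folio)
{
	/* Our VMA contributes 1; anything above that is another VMA. */
	return folio_mapcount(folio) == 1;
}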

That means that our future folio_add_file_rmap_range() looks a bit
like this:

{
	bool add_mapcount = true;

	/*
	 * If we're only mapping some of the folio's pages, this VMA may
	 * already map other pages of it; only count the first mapping.
	 */
	if (nr < folio_nr_pages(folio))
		add_mapcount = !folio_mapped_in_vma(folio, vma);
	if (add_mapcount)
		atomic_inc(&folio->_mapcount);

	__lruvec_stat_mod_folio(folio, NR_FILE_MAPPED, nr);
	if (nr == HPAGE_PMD_NR)
		__lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
			NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, nr);

	mlock_vma_folio(folio, vma, nr == HPAGE_PMD_NR);
}

bool folio_mapped_in_vma(struct folio *folio, struct vm_area_struct *vma)
{
	unsigned long address = vma_address(&folio->page, vma);
	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);

	/* Any single hit is enough; we don't need to walk further. */
	if (!page_vma_mapped_walk(&pvmw))
		return false;
	page_vma_mapped_walk_done(&pvmw);
	return true;
}

... some details to be fixed here; in particular, this will currently
deadlock on the PTL, so we'd need not only to exclude the current
PMD from being examined, but also to avoid a deadly embrace between
two threads (do we currently have a locking order defined for
page table locks at the same height of the tree?)
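
For completeness, the remove side would presumably mirror this: only
drop the mapcount once the VMA no longer maps any page of the folio.
Rough sketch (untested, and it has the same PTL problem, since we'd
again be walking the rmap with a page table lock already held):

{
	bool drop_mapcount = true;

	/* This VMA may still map other pages of the folio. */
	if (nr < folio_nr_pages(folio))
		drop_mapcount = !folio_mapped_in_vma(folio, vma);
	if (drop_mapcount)
		atomic_dec(&folio->_mapcount);

	__lruvec_stat_mod_folio(folio, NR_FILE_MAPPED, -nr);
	if (nr == HPAGE_PMD_NR)
		__lruvec_stat_mod_folio(folio, folio_test_swapbacked(folio) ?
			NR_SHMEM_PMDMAPPED : NR_FILE_PMDMAPPED, -nr);
}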




