Re: A mapcount riddle

On 24.01.23 21:56, Mike Kravetz wrote:
Q How can a page be mapped into multiple processes and have a
   mapcount of 1?

A It is a hugetlb page referenced by a shared PMD.

I was looking to expose some basic information about PMD sharing via
/proc/smaps.  After adding the code, I started a couple processes
sharing a large hugetlb mapping that would result in the use of
shared PMDs.  When I looked at the output of /proc/smaps, I saw
my new metric counting the number of shared PMDs.  However, what
stood out was that the entire mapping was listed as Private_Hugetlb.
WTH???  It certainly was shared!  The routine smaps_hugetlb_range
decides between Private_Hugetlb and Shared_Hugetlb with this code:

	if (page) {
		int mapcount = page_mapcount(page);

		if (mapcount >= 2)
			mss->shared_hugetlb += huge_page_size(hstate_vma(vma));
		else
			mss->private_hugetlb += huge_page_size(hstate_vma(vma));
	}

After spending some time looking for issues in the page_mapcount code,
I came to the realization that the mapcount of hugetlb pages only
referenced by a shared PMD would be 1 no matter how many processes had
mapped the page.  When the page is first faulted, its mapcount is set
to 1.  When other processes later fault on the mapping, the
already-populated shared PMD is simply installed in their page tables;
the mapcount is never increased.

At first thought, this seems bad.  However, I believe this has been
the behavior since hugetlb PMD sharing was introduced in 2006, and I
am unaware of any reported issues.  I did an audit of code looking at
mapcount.  In addition to the above issue with smaps, there appears
to be an issue with 'migrate_pages' where shared pages could be migrated
without appropriate privilege.

	/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
	if (flags & (MPOL_MF_MOVE_ALL) ||
	    (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
		if (isolate_hugetlb(page, qp->pagelist) &&
			(flags & MPOL_MF_STRICT))
			/*
			 * Failed to isolate page but allow migrating pages
			 * which have been queued.
			 */
			ret = 1;
	}

I will prepare fixes for both of these.  However, I wanted to ask
whether anyone has ideas about other potential issues caused by this.

Since COW is mostly relevant to private mappings, and PMD sharing
applies only to shared mappings, it generally does not come into play.
Nothing stood out in a quick audit of the code.

Yes, we shouldn't have to worry about anon pages in shared PMDs.

The observed mapcount weirdness is one of the reasons why I suggested,
for PTE-table sharing (a new RFC was posted some time ago, but I have
had no time to look into it), treating sharing of the page table only
as a mechanism to deduplicate page-table memory -- and not changing
the semantics of the pages mapped in there.  That is: if a page is
logically mapped into two page-table structures, its refcount and
mapcount would be 2 instead of 1.

Of course, that implies some additional sharing-aware map/unmap
logic, because the refcount and mapcount have to be adjusted
accordingly.

But PTE-table sharing has to take proper care of private mappings as well, that's more what I was concerned about.

--
Thanks,

David / dhildenb




