Re: A mapcount riddle

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jan 24, 2023 at 03:35:38PM -0800, Mike Kravetz wrote:
> On 01/24/23 18:00, Peter Xu wrote:
> > On Tue, Jan 24, 2023 at 12:56:24PM -0800, Mike Kravetz wrote:
> > > At first thought this seems bad.  However, I believe this has been the
> > > behavior since hugetlb PMD sharing was introduced in 2006 and I am
> > > unaware of any reported issues.  I did a audit of code looking at
> > > mapcount.  In addition to the above issue with smaps, there appears
> > > to be an issue with 'migrate_pages' where shared pages could be migrated
> > > without appropriate privilege.
> > > 
> > > 	/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
> > > 	if (flags & (MPOL_MF_MOVE_ALL) ||
> > > 	    (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
> > > 		if (isolate_hugetlb(page, qp->pagelist) &&
> > > 			(flags & MPOL_MF_STRICT))
> > > 			/*
> > > 			 * Failed to isolate page but allow migrating pages
> > > 			 * which have been queued.
> > > 			 */
> > > 			ret = 1;
> > > 	}
> > > 
> > > I will prepare fixes for both of these.  However, I wanted to ask if
> > > anyone has ideas about other potential issues with this?
> > 
> > This reminded me whether things should be checked already before this
> > happens.  E.g. when trying to share pmd, whether it makes sense to check
> > vma mempolicy before doing so?
> 
> Not sure I understand your question.  Are you questioning whether we should
> enter into pmd sharing if mempolicy allows movement to another node?
> Wouldn't this be the 'normal' case on a multi-node system?
> 
> > Then the question is if pmd sharing only happens with the vma that shares
> > the same memory policy, whether above mapcount==1 check would be acceptable
> > even if it's shared by multiple processes.
> 
> I am not a mempolicy expert, but that would still involve moving pages
> mapped by another process.  For that CAP_SYS_NICE is required.  So, my
> opinion would be that it is not allowed even if mempolicy is the same.

Makes sense.

> 
> > Besides, I'm also curious on the planned fix too regarding the two issues
> > mentioned.
> 
> My planned 'fix' is to simply check for shared a PMD
> (page_count(virt_to_page(pte))) to determine if page with mapcount == 1
> is shared.

I think having the current pte* won't easily work, we'll need to walk all
the pgtable that mapped this page.

To be explicit, one page can be mapped at pgtable1 which is shared by proc1
& proc2, and it can also be mapped at pgtable2 which is shared by proc3 &
proc4.  Then (assuming pte1* points to pgtable1):

  page_count(virt_to_page(pte1)) + page_mapcount(page)

Won't be the right mapcount we're looking for.

But then if we're going to rmap walk all the mappings it seems even easier
to just count how many times the page is mapped when walking, rather than
relying on page_mapcount.

What David said on maintaining map/ref counts during sharing is another
approach, I'm wondering whether there's any better way to do this.

Thanks,

-- 
Peter Xu





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux