Re: A mapcount riddle

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 25, 2023 at 08:22:24AM -0800, James Houghton wrote:
> On Wed, Jan 25, 2023 at 7:54 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > On Wed, Jan 25, 2023 at 07:26:49AM -0800, James Houghton wrote:
> > > > At first thought this seems bad.  However, I believe this has been the
> > > > behavior since hugetlb PMD sharing was introduced in 2006 and I am
> > > > unaware of any reported issues.  I did a audit of code looking at
> > > > mapcount.  In addition to the above issue with smaps, there appears
> > > > to be an issue with 'migrate_pages' where shared pages could be migrated
> > > > without appropriate privilege.
> > > >
> > > >         /* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
> > > >         if (flags & (MPOL_MF_MOVE_ALL) ||
> > > >             (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
> > > >                 if (isolate_hugetlb(page, qp->pagelist) &&
> > > >                         (flags & MPOL_MF_STRICT))
> > > >                         /*
> > > >                          * Failed to isolate page but allow migrating pages
> > > >                          * which have been queued.
> > > >                          */
> > > >                         ret = 1;
> > > >         }
> > >
> > > This isn't the exact same problem you're fixing Mike, but I want to
> > > point out a related problem.
> > >
> > > This is the generic-mm-equivalent of the hugetlb code above:
> > >
> > > static int migrate_page_add(struct page *page, struct list_head
> > > *pagelist, unsigned long flags)
> > > {
> > >         struct page *head = compound_head(page);
> > >         /*
> > >         * Avoid migrating a page that is shared with others.
> > >         */
> > >         if ((flags & MPOL_MF_MOVE_ALL) || page_mapcount(head) == 1) {
> > >                 if (!isolate_lru_page(head)) {
> > >                         list_add_tail(&head->lru, pagelist);
> > >                         mod_node_page_state(page_pgdat(head),
> > >                                 NR_ISOLATED_ANON + page_is_file_lru(head),
> > >                                 thp_nr_pages(head));
> > > ...
> > > }
> > >

There's also 3 functions in migrate that appear to check for a similar
condition - add_page_for_migration(), numamigrate_isolate_page(), and
migrate_misplaced_page(). 

> > > If you have a partially PTE-mapped THP, page_mapcount(head) will not
> > > accurately determine if a page is mapped in multiple VMAs or not (it
> > > only tells you how many times the head page is mapped).
> > >
> > > For example...
> > > 1) You could have the THP PMD-mapped in one VMA, and then one tail
> > > page of the THP can be mapped in another. page_mapcount(head) will be
> > > 1.
> > > 2) You could have two VMAs map two separate tail pages of the THP, in
> > > which case page_mapcount(head) will be 0.
> > >
> > > I bring this up because we have the same problem with HugeTLB
> > > high-granularity mapping.
> >
> > Maybe a better match here is total_mapcount() rather than page_mapcount()
> > (despite the overheads on the sub-page loop)?
> 
> This would kind of fix the problem, but it would be too conservative now. :)

I agree. Interestingly, numamigrate_isolate_page() does take the
total_mapcount() approach right now, so I'm not sure how much of a
difference being more conservative makes.

> In both example 1 and 2 above, total_mapcount(head) for both would be
> 2, so that's ok. But now consider: you have one VMA that is
> PTE-mapping two pieces of the same THP. total_mapcount(head) is still
> 2, even though only a single VMA is mapping the page.




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux