Re: A mapcount riddle

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 26, 2023 at 1:15 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 25.01.23 17:22, James Houghton wrote:
> > On Wed, Jan 25, 2023 at 7:54 AM Peter Xu <peterx@xxxxxxxxxx> wrote:
> >>
> >> On Wed, Jan 25, 2023 at 07:26:49AM -0800, James Houghton wrote:
> >>>> At first thought this seems bad.  However, I believe this has been the
> >>>> behavior since hugetlb PMD sharing was introduced in 2006 and I am
> >>>> unaware of any reported issues.  I did a audit of code looking at
> >>>> mapcount.  In addition to the above issue with smaps, there appears
> >>>> to be an issue with 'migrate_pages' where shared pages could be migrated
> >>>> without appropriate privilege.
> >>>>
> >>>>          /* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
> >>>>          if (flags & (MPOL_MF_MOVE_ALL) ||
> >>>>              (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
> >>>>                  if (isolate_hugetlb(page, qp->pagelist) &&
> >>>>                          (flags & MPOL_MF_STRICT))
> >>>>                          /*
> >>>>                           * Failed to isolate page but allow migrating pages
> >>>>                           * which have been queued.
> >>>>                           */
> >>>>                          ret = 1;
> >>>>          }
> >>>
> >>> This isn't the exact same problem you're fixing Mike, but I want to
> >>> point out a related problem.
> >>>
> >>> This is the generic-mm-equivalent of the hugetlb code above:
> >>>
> >>> static int migrate_page_add(struct page *page, struct list_head
> >>> *pagelist, unsigned long flags)
> >>> {
> >>>          struct page *head = compound_head(page);
> >>>          /*
> >>>          * Avoid migrating a page that is shared with others.
> >>>          */
> >>>          if ((flags & MPOL_MF_MOVE_ALL) || page_mapcount(head) == 1) {
> >>>                  if (!isolate_lru_page(head)) {
> >>>                          list_add_tail(&head->lru, pagelist);
> >>>                          mod_node_page_state(page_pgdat(head),
> >>>                                  NR_ISOLATED_ANON + page_is_file_lru(head),
> >>>                                  thp_nr_pages(head));
> >>> ...
> >>> }
> >>>
> >>> If you have a partially PTE-mapped THP, page_mapcount(head) will not
> >>> accurately determine if a page is mapped in multiple VMAs or not (it
> >>> only tells you how many times the head page is mapped).
> >>>
> >>> For example...
> >>> 1) You could have the THP PMD-mapped in one VMA, and then one tail
> >>> page of the THP can be mapped in another. page_mapcount(head) will be
> >>> 1.
> >>> 2) You could have two VMAs map two separate tail pages of the THP, in
> >>> which case page_mapcount(head) will be 0.
> >>>
> >>> I bring this up because we have the same problem with HugeTLB
> >>> high-granularity mapping.
> >>
> >> Maybe a better match here is total_mapcount() rather than page_mapcount()
> >> (despite the overheads on the sub-page loop)?
>
> IIRC, total_mapcount() would also not be what we want: for a PTE-mapped
> THP it would be e.g. 512 instead of one. [unless I am confused again
> about mapcounts]
>
> See my other comment, I believe this is supposed to be a guesstimate
> whether "this page is shared". And we use the first subpage to make a
> guess here ...

I tried to dig into the git and review history. It seems like the code
was added when THP migration support was introduced, and was just
copied from the base page case IIRC.

I agree it is a heuristic guess about whether this page is shared or
not, but reading the head page's mapcount is not correct AFAICT. The
total_mapcount should be used, although it can't distinguish unshared
PTE mapped (multiple subpages mapped by PTEs) THP, but it could filter
out shared pages as expected.

>
> Of course, we could try harder, by looking at > 1 subpage, to test if
> any of these subpages has a mapcount > 1. But it still wouldn't be
> accurate ....

A little bit overkilling TBH.

>
> --
> Thanks,
>
> David / dhildenb
>
>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux