On Mon Feb 24, 2025 at 11:56 AM EST, David Hildenbrand wrote:
> Let's implement an alternative when per-page mapcounts in large folios are
> no longer maintained -- soon with CONFIG_NO_PAGE_MAPCOUNT.
>
> When computing the output for smaps / smaps_rollups, in particular when
> calculating the USS (Unique Set Size) and the PSS (Proportional Set Size),
> we still rely on per-page mapcounts.
>
> To determine private vs. shared, we'll use folio_likely_mapped_shared(),
> similar to how we handle PM_MMAP_EXCLUSIVE. Similarly, we might now
> under-estimate the USS and count pages towards "shared" that are
> actually "private" ("exclusively mapped").
>
> When calculating the PSS, we'll now also use the average per-page
> mapcount for large folios: this can result in both, an over-estimation
> and an under-estimation of the PSS. The difference is not expected to
> matter much in practice, but we'll have to learn as we go.
>
> We can now provide folio_precise_page_mapcount() only with
> CONFIG_PAGE_MAPCOUNT, and remove one of the last users of per-page
> mapcounts when CONFIG_NO_PAGE_MAPCOUNT is enabled.
>
> Document the new behavior.
>
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
> ---
>  Documentation/filesystems/proc.rst | 13 +++++++++++++
>  fs/proc/internal.h                 |  8 ++++++++
>  fs/proc/task_mmu.c                 | 17 +++++++++++++++--
>  3 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 1aa190017f796..57d55274a1f42 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -506,6 +506,19 @@ Note that even a page which is part of a MAP_SHARED mapping, but has only
>  a single pte mapped, i.e. is currently used by only one process, is accounted
>  as private and not as shared.
>
> +Note that in some kernel configurations, all pages part of a larger allocation
> +(e.g., THP) might be considered "shared" if the large allocation is
> +considered "shared": if not all pages are exclusive to the same process.
> +Further, some kernel configurations might consider larger allocations "shared",
> +if they were at one point considered "shared", even if they would now be
> +considered "exclusive".
> +
> +Some kernel configurations do not track the precise number of times a page part
> +of a larger allocation is mapped. In this case, when calculating the PSS, the
> +average number of mappings per page in this larger allocation might be used
> +as an approximation for the number of mappings of a page. The PSS calculation
> +will be imprecise in this case.
> +
>  "Referenced" indicates the amount of memory currently marked as referenced or
>  accessed.
>
> diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> index 16aa1fd260771..70205425a2daa 100644
> --- a/fs/proc/internal.h
> +++ b/fs/proc/internal.h
> @@ -143,6 +143,7 @@ unsigned name_to_int(const struct qstr *qstr);
>  /* Worst case buffer size needed for holding an integer. */
>  #define PROC_NUMBUF 13
>
> +#ifdef CONFIG_PAGE_MAPCOUNT
>  /**
>   * folio_precise_page_mapcount() - Number of mappings of this folio page.
>   * @folio: The folio.
> @@ -173,6 +174,13 @@ static inline int folio_precise_page_mapcount(struct folio *folio,
>
>  	return mapcount;
>  }
> +#else /* !CONFIG_PAGE_MAPCOUNT */
> +static inline int folio_precise_page_mapcount(struct folio *folio,
> +		struct page *page)
> +{
> +	BUILD_BUG();
> +}
> +#endif /* CONFIG_PAGE_MAPCOUNT */
>
>  /**
>   * folio_average_page_mapcount() - Average number of mappings per page in this
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index d7ee842367f0f..7ca0bc3bf417d 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -707,6 +707,8 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
>  	struct folio *folio = page_folio(page);
>  	int i, nr = compound ? compound_nr(page) : 1;
>  	unsigned long size = nr * PAGE_SIZE;
> +	bool exclusive;
> +	int mapcount;
>
>  	/*
>  	 * First accumulate quantities that depend only on |size| and the type
> @@ -747,18 +749,29 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
>  				      dirty, locked, present);
>  		return;
>  	}
> +
> +	if (IS_ENABLED(CONFIG_NO_PAGE_MAPCOUNT)) {
> +		mapcount = folio_average_page_mapcount(folio);

This seems inconsistent with how folio_average_page_mapcount() is used in
patches 16 and 18.

> +		exclusive = !folio_maybe_mapped_shared(folio);
> +	}
> +
>  	/*
>  	 * We obtain a snapshot of the mapcount. Without holding the folio lock
>  	 * this snapshot can be slightly wrong as we cannot always read the
>  	 * mapcount atomically.
>  	 */
>  	for (i = 0; i < nr; i++, page++) {
> -		int mapcount = folio_precise_page_mapcount(folio, page);
>  		unsigned long pss = PAGE_SIZE << PSS_SHIFT;
> +
> +		if (IS_ENABLED(CONFIG_PAGE_MAPCOUNT)) {
> +			mapcount = folio_precise_page_mapcount(folio, page);
> +			exclusive = mapcount < 2;
> +		}
> +
>  		if (mapcount >= 2)
>  			pss /= mapcount;
>  		smaps_page_accumulate(mss, folio, PAGE_SIZE, pss,
> -				      dirty, locked, mapcount < 2);
> +				      dirty, locked, exclusive);
>  	}
>  }
>
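As a side note, for anyone trying to get a feel for the PSS change described in
the commit message, here is a rough userspace sketch. It is only an
illustration with made-up mapcount values, and it ignores the kernel's
PSS_SHIFT fixed-point arithmetic and the exact rounding done by
folio_average_page_mapcount(); it is not the kernel code.

/*
 * pss_avg_demo.c - toy userspace illustration, NOT kernel code.
 *
 * Compares the "precise" PSS contribution of a large folio (divide each
 * page by its own mapcount) with the "averaged" approximation (divide
 * every page by the folio-wide average mapcount).
 */
#include <stdio.h>

#define PAGE_SIZE 4096UL

int main(void)
{
	/* Hypothetical 4-page folio: two pages mapped once, two mapped 3 times. */
	unsigned long mapcount[] = { 1, 1, 3, 3 };
	unsigned long nr = sizeof(mapcount) / sizeof(mapcount[0]);
	unsigned long precise_pss = 0, total = 0, avg, approx_pss, i;

	for (i = 0; i < nr; i++) {
		/* Precise: each page contributes PAGE_SIZE / its own mapcount. */
		precise_pss += PAGE_SIZE / mapcount[i];
		total += mapcount[i];
	}

	/* Approximation: every page contributes PAGE_SIZE / average mapcount. */
	avg = total / nr;		/* (1 + 1 + 3 + 3) / 4 = 2 */
	approx_pss = nr * (PAGE_SIZE / avg);

	printf("precise  PSS: %lu bytes\n", precise_pss);	/* 10922 */
	printf("averaged PSS: %lu bytes\n", approx_pss);	/* 8192 */
	return 0;
}

With these made-up numbers the averaged PSS comes out lower than the precise
one; depending on how the mapcounts are distributed (and rounded), it can also
come out higher, which matches the commit message's note that both
over-estimation and under-estimation are possible.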
--
Best Regards,
Yan, Zi