Re: Folio mapcount

On 2 Jul 2023, at 7:45, David Hildenbrand wrote:

> On 02.07.23 11:50, Yin, Fengwei wrote:
>>
>>
>> On 7/1/2023 9:17 AM, Zi Yan wrote:
>>> In the kernel, almost all code only cares: 1) if a page/folio has extra pins
>>> by checking if mapcount is equal to refcount + extra, and 2)
>>> if a page/folio is mapped multiple times. A single mapcount can meet
>>> these two needs.
>> For 2, how can we know whether a page/folio is mapped multiple times from
>> a single mapcount? My understanding is we need two counts as a folio could be
>> partially mapped.
>
> Yes, a single mapcount is most probably insufficient. I started analyzing all existing users and use cases, trying to avoid walking page tables.

From my understanding, a single mapcount is sufficient for kernel users that
call page_mapcount(), because they either check mapcount against refcount to
see if a page has extra pins or check mapcount to see if a page is mapped more
than once.
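
To make the two checks concrete, here is a minimal userspace sketch. It is
only a model: struct folio_model, model_has_extra_pins() and
model_mapped_more_than_once() are hypothetical names for illustration, not
kernel APIs, and "expected_extra" stands for whatever references the caller
already knows about.

```c
#include <stdbool.h>

/* Hypothetical userspace model, NOT the kernel's struct page/folio. */
struct folio_model {
	long refcount;	/* total references held on the page/folio */
	long mapcount;	/* total number of page table mappings */
};

/*
 * Use 1: references beyond the mappings plus the references the caller
 * expects are treated as extra pins.
 */
static bool model_has_extra_pins(const struct folio_model *f,
				 long expected_extra)
{
	return f->refcount > f->mapcount + expected_extra;
}

/* Use 2: is there more than one mapping in total? */
static bool model_mapped_more_than_once(const struct folio_model *f)
{
	return f->mapcount > 1;
}
```

Note that for a partially mapped large folio, a total above 1 in this model
can also come from several different subpages each mapped once, which is the
concern raised in the quoted question above.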

>
> If we want to get rid of all of (most) sub-page mapcounts, we'd probably want:
>
> (1) Total mapcount (compound + any sub-page): page_mapped(), pagecount
>     vs. refcount games, ...

A single mapcount is sufficient in this case.

>
> (2) Compound mapcount (for PMD/PUD-mappable THP only): (2) - (1) tells
>     you if it's only PMD mapped or also PTE-mapped. For example, for
>     statistics but also swapout code.

For statistics, it is for NR_{ANON,FILE}_MAPPED and NR_ANON_THP. I wonder
if we can use the number of anonymous/file pages and THPs instead, without
caring whether they are mapped or not.

For swapout, folio_entire_mapcount() is used to estimate if a THP is fully
mapped or not. I wonder if we can get away with another estimation like
total_mapcount() > folio_nr_pages().
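
As a rough sketch of what such an estimation would look like, assuming every
mapping of a subpage adds one to the total (that accounting is my assumption
here; struct large_folio_model and model_total_exceeds_nr_pages() are
hypothetical names, not kernel code):

```c
#include <stdbool.h>

/* Hypothetical userspace model of a large folio, NOT kernel code. */
struct large_folio_model {
	long nr_pages;		/* number of subpages in the folio */
	long total_mapcount;	/* assumed: one count per subpage mapping */
};

/*
 * The proposed estimation: compare the total mapcount against the number
 * of subpages. Under the assumption above, a true result implies that at
 * least one subpage is mapped more than once.
 */
static bool model_total_exceeds_nr_pages(const struct large_folio_model *f)
{
	return f->total_mapcount > f->nr_pages;
}
```

It is still only an estimation: a false result does not say which subpages
are mapped, only that the total does not exceed the subpage count.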

>
> (3) Mapcount of first (or any other) subpage (compound+subpage): for
>     folio_estimated_sharers().

This is another estimation. I wonder if we can use a different estimation
like total_mapcount() > folio_nr_pages() instead.

>
> For anon pages, I'm thinking about remembering an additional
>
> (1) Page/folio creator (MM pointer/identification)
> (2) Page/folio creator mapcount
>
> When optimizing a PTE-mapped THP (especially non-PMD-mappable) for the fork()+exec() case, we'd have to walk page tables to see if all folio references come from this MM. The page/folio creator avoids that completely. We might need a mechanism to synchronize against mapping/unmapping of this folio from the creator concurrently (relevant when mapped into multiple page tables).

creator_mapcount < total_mapcount means multiple MMs map this folio? And this is for
the page exclusivity check? Sorry I have not checked the code in detail yet. The sync
of creator_mapcount with total_mapcount might have some extra cost. I wonder if
this can be solved by checking num_active_vmas in the anon_vma of a folio.
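
To check my reading of the creator idea, here is how I picture the comparison
in a small userspace model. Everything in it is hypothetical (the struct, the
field names, the helper); it only restates the proposal quoted above and
ignores the synchronization problem entirely.

```c
#include <stdbool.h>

/* Hypothetical model of the proposed per-folio creator tracking. */
struct anon_folio_model {
	unsigned long creator_mm_id;	/* stand-in for the creator MM pointer */
	long creator_mapcount;		/* mappings established by the creator MM */
	long total_mapcount;		/* mappings from all MMs */
};

/*
 * If every mapping belongs to the creator, no other MM maps the folio.
 * creator_mapcount < total_mapcount would mean some other MM maps it too.
 */
static bool model_mapped_only_by_creator(const struct anon_folio_model *f)
{
	return f->total_mapcount > 0 &&
	       f->creator_mapcount == f->total_mapcount;
}
```

Keeping the two counters consistent while the folio is mapped or unmapped
concurrently is where I would expect the extra cost to come from.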

>
>
> Further, for (1) we'd want a 64bit mapcount for large folios, which implies a 64bit refcount. For smallish folios, we don't really care.
>
>
> We should most probably use a bi-weekly MM meeting to discuss that.
>
> Hopefully, I have a full understanding of all use cases and requirements by then. Don't have sufficient time to look into all the nasty details right now.

I agree that we should discuss this more and come up with a detailed list of all use
cases to make sure we do not miss any use case and hopefully simplify the use
of the various mapcounts if possible.


--
Best Regards,
Yan, Zi
