On 09.08.23 23:23, Peter Xu wrote:
Hi, David,
Some pure questions below..
Hi Peter,
thanks for having a look!
[...]
With sub-PMD THP becoming more important and things looking promising
that we will soon get support for such anon THP, we want to avoid looping
over all pages of a folio just to calculate the total mapcount. Further,
we may soon want to use the total mapcount in other contexts more
frequently, so prepare for reading it efficiently and atomically.
Any (perhaps existing) discussion on reduced loops vs added atomic
field/ops?
So far it's not been raised as a concern, so no existing discussion.
For order-0 pages the behavior is unchanged.
For PMD-mapped THP and hugetlb it's most certainly noise compared to the
other activities when (un)mapping these large pages.
For PTE-mapped THP, it might be a bit bigger noise, although I doubt it
is really significant (judging from my experience on managing
PageAnonExclusive using set_bit/test_bit/clear_bit when (un)mapping anon
pages).
As folio_add_file_rmap_range() indicates, for PTE-mapped THPs we should
be batching where possible (and Ryan is working on some more rmap
batching). There, managing the subpage mapcount dominates all other
overhead significantly.
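
To make the trade-off concrete, here is a rough userspace sketch in C11.
It is a toy model, not kernel code: struct toy_folio and all toy_*
helpers are hypothetical names made up for illustration, and counts start
at 0 for simplicity. It only shows the idea: pay one extra atomic op per
(un)map, or one per batch, so that reading the total no longer needs a
loop over all pages.

```c
#include <stdatomic.h>
#include <stddef.h>

#define TOY_MAX_PAGES 512

/* Toy stand-in for a large folio; NOT the kernel's struct folio. */
struct toy_folio {
	size_t nr_pages;
	atomic_int page_mapcount[TOY_MAX_PAGES];	/* per-subpage counts */
	atomic_int total_mapcount;			/* extra total counter */
};

static inline void toy_folio_init(struct toy_folio *f, size_t nr_pages)
{
	f->nr_pages = nr_pages;
	for (size_t i = 0; i < nr_pages; i++)
		atomic_init(&f->page_mapcount[i], 0);
	atomic_init(&f->total_mapcount, 0);
}

/* Map a single page: one extra atomic add compared to subpage-only. */
static inline void toy_map_page(struct toy_folio *f, size_t idx)
{
	atomic_fetch_add(&f->page_mapcount[idx], 1);
	atomic_fetch_add(&f->total_mapcount, 1);
}

/* Map a range: subpage counts are per page, the total is updated once. */
static inline void toy_map_range(struct toy_folio *f, size_t first, size_t nr)
{
	for (size_t i = first; i < first + nr; i++)
		atomic_fetch_add(&f->page_mapcount[i], 1);
	atomic_fetch_add(&f->total_mapcount, (int)nr);
}

/* Reading the total by looping: O(nr_pages), not one atomic snapshot. */
static inline int toy_total_mapcount_loop(struct toy_folio *f)
{
	int sum = 0;

	for (size_t i = 0; i < f->nr_pages; i++)
		sum += atomic_load(&f->page_mapcount[i]);
	return sum;
}

/* Reading the dedicated counter: a single atomic load. */
static inline int toy_total_mapcount_fast(struct toy_folio *f)
{
	return atomic_load(&f->total_mapcount);
}
```

In the batched helper, the per-page loop is where the work is, and the
single add to the total is small in comparison.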
I had a feeling that there's some discussion behind the proposal of this
patch; if that's the case, it'd be great to attach the link in the commit
log.
There were (mostly offline) discussions on how to sort out some other
issues that PTE-mapped THP are facing, and how to eventually get rid of
the subpage mapcounts (one consumer being _nr_pages_mapped as spelled
out in the patch description). Having a proper total mapcount available
was discussed as one building block.
I don't think I have anything of value to link that would make sense for
the patch as is, as this patch is mostly independent from all that.
--
Cheers,
David / dhildenb