On 30.08.23 12:44, Ryan Roberts wrote:
Hi All,
Hi Ryan,
I'll be back from vacation next Wednesday.
Note that I asked David R. to have large anon folios as topic for the
next bi-weekly mm meeting.
There, we should discuss things like
* naming
* accounting (/proc/meminfo)
* required toggles (especially ways to disable it, as we want to
keep toggles minimal)
David R. raised that there are certainly workloads where the additional
memory overhead is usually not acceptable. So it will be valuable to get
input from others.
I want to get serious about getting large anon folios merged. To do that, there
are a number of outstanding prerequisites. I'm hoping the respective owners may
be able to provide an update on progress?
I shared some details in the last meeting when you were on vacation :)
High level update below.
[...]
- item:
shared vs exclusive mappings
priority:
prerequisite
description: >-
New mechanism to allow us to easily determine precisely whether a given
folio is mapped exclusively or shared between multiple processes. Required
for (from David H):
(1) Detecting shared folios, to not mess with them while they are shared.
MADV_PAGEOUT, user-triggered page migration, NUMA hinting, khugepaged ...
replace cases where folio_estimated_sharers() == 1 would currently be the
best we can do (and in some cases, page_mapcount() == 1).
(2) COW improvements for PTE-mapped large anon folios after fork(). Before
fork(), PageAnonExclusive would have been reliable, after fork() it's not.
For (1), "MADV_PAGEOUT" maps to the "madvise" item captured in this list. I
*think* "NUMA hinting" maps to "numa balancing" (but need confirmation!).
"user-triggered page migration" and "khugepaged" not yet captured (would
appreciate someone fleshing it out). I previously understood migration to be
working for large folios - is "user-triggered page migration" some specific
aspect that does not work?
For (2), this relates to Large Anon Folio enhancements which I plan to
tackle after we get the basic series merged.
links:
- 'email thread: Mapcount games: "exclusive mapped" vs. "mapped shared"'
location:
- shrink_folio_list()
assignee:
David Hildenbrand <david@xxxxxxxxxx>
Any comment on this David? I think the last comment I saw was that you were
planning to start an implementation a couple of weeks back? Did that get anywhere?
The math should be solid at this point and I had a simple prototype
running -- including fairly clean COW reuse handling.
I started cleaning it all up before my vacation. I'll first need the
total mapcount (which I sent), and might have to implement rmap patching
during THP split (easy), but I first have to do more measurements.
Willy's patches to free up space in the first tail page will be
required, as will my patches to free up ->private in tail pages for
THP_SWAP. Both are on their way upstream.
Based on that, I need a bit spinlock to protect the total
mapcount+tracking data. There are things to measure (contention) and
optimize (why even care about tracking shared vs. exclusive if it's
pretty guaranteed to always be shared -- for example, shared libraries).
So it looks reasonable at this point, but I'll have to look into
possible contentions and optimizations once I have the basics
implemented cleanly.
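To make the locking idea concrete, here is a minimal user-space sketch of a
bit spinlock guarding a total mapcount plus some tracking data. It is only an
illustration under stated assumptions, not the prototype: struct
toy_folio_track, TOY_LOCK_BIT and the toy_*() helpers are made-up names, the
"tracking" field is an opaque placeholder rather than the real tracking
scheme, and C11 atomics stand in for the kernel's bit_spin_lock() and
bit_spin_unlock().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical layout: bit 0 of the flags word doubles as the lock. */
#define TOY_LOCK_BIT 0x1UL

struct toy_folio_track {
	atomic_ulong flags;      /* bit 0 is the spinlock bit */
	int total_mapcount;      /* protected by the lock bit */
	unsigned long tracking;  /* opaque placeholder, protected by the lock bit */
};

static void toy_init(struct toy_folio_track *t)
{
	atomic_init(&t->flags, 0UL);
	t->total_mapcount = 0;
	t->tracking = 0;
}

static void toy_bit_spin_lock(struct toy_folio_track *t)
{
	/* Spin until this caller is the one that flipped the bit from 0 to 1. */
	while (atomic_fetch_or_explicit(&t->flags, TOY_LOCK_BIT,
					memory_order_acquire) & TOY_LOCK_BIT)
		;
}

static void toy_bit_spin_unlock(struct toy_folio_track *t)
{
	unsigned long old = atomic_fetch_and_explicit(&t->flags, ~TOY_LOCK_BIT,
						      memory_order_release);

	/* Unlocking a lock that was not held would be a bug in the caller. */
	assert(old & TOY_LOCK_BIT);
}

/* Update the mapcount and the tracking data together, under the lock. */
static void toy_add_mapping(struct toy_folio_track *t, unsigned long id)
{
	toy_bit_spin_lock(t);
	t->total_mapcount++;
	t->tracking += id;
	toy_bit_spin_unlock(t);
}

static void toy_remove_mapping(struct toy_folio_track *t, unsigned long id)
{
	toy_bit_spin_lock(t);
	assert(t->total_mapcount > 0);
	t->total_mapcount--;
	t->tracking -= id;
	toy_bit_spin_unlock(t);
}
```

Every map and unmap takes the same per-folio lock bit, which is why
contention on heavily shared folios is the thing to measure.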
It's a shame we cannot get the subpage mapcount out of the way
immediately, then it wouldn't be "additional tracking" but "different
tracking" :)
Once back from vacation, I'm planning on prioritizing this. Shouldn't
take ages to get it cleaned up. Measurements and optimizations might
take a bit longer.
[...]
assignee:
Yin, Fengwei <fengwei.yin@xxxxxxxxx>
As I understand it: initial solution based on folio_estimated_sharers() has gone
into v6.5. Have a dependency on David's precise shared vs exclusive work for an
improved solution.
Having shared vs. exclusive in place would replace folio_estimated_sharers()
users and most sub-page mapcount users.
And I think you mentioned you are planning to do a change
that avoids splitting a large folio if it is entirely covered by the range?
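For anyone following along, a toy user-space model of why
folio_estimated_sharers() is only an estimate. As far as I recall, in v6.5 it
looks only at the mapcount of the first subpage, so a process that maps just
the tail of a PTE-mapped folio goes unnoticed. Everything below (struct
toy_folio, NR_PAGES, NR_PROCS and the helper names) is hypothetical and only
models the idea; it is not kernel code and not the planned precise mechanism.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PAGES 4  /* hypothetical folio size in subpages */
#define NR_PROCS 2  /* hypothetical number of processes */

/* Toy model: mapped[p][i] is true if process p maps subpage i via a PTE. */
struct toy_folio {
	bool mapped[NR_PROCS][NR_PAGES];
};

static void toy_map(struct toy_folio *f, int proc, int page)
{
	assert(proc >= 0 && proc < NR_PROCS);
	assert(page >= 0 && page < NR_PAGES);
	f->mapped[proc][page] = true;
}

/* Number of processes mapping one particular subpage. */
static int subpage_mapcount(const struct toy_folio *f, int page)
{
	int count = 0;

	for (int p = 0; p < NR_PROCS; p++)
		count += f->mapped[p][page] ? 1 : 0;
	return count;
}

/* The estimate: only the first subpage is consulted. */
static int estimated_sharers(const struct toy_folio *f)
{
	return subpage_mapcount(f, 0);
}

/* The precise answer: processes mapping at least one subpage. */
static int precise_sharers(const struct toy_folio *f)
{
	int sharers = 0;

	for (int p = 0; p < NR_PROCS; p++) {
		for (int i = 0; i < NR_PAGES; i++) {
			if (f->mapped[p][i]) {
				sharers++;
				break;
			}
		}
	}
	return sharers;
}
```

If process 0 maps all four subpages and process 1 maps only subpages 2 and 3,
estimated_sharers() returns 1 while precise_sharers() returns 2, so the folio
would wrongly be treated as exclusive.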
[...]
- item:
numa balancing
priority:
prerequisite
description: >-
Large, pte-mapped folios are ignored by numa-balancing code. Commit comment
(e81c480): "We're going to have THP mapped with PTEs. It will confuse
numabalancing. Let's skip them for now." Likely depends on "shared vs
exclusive mappings".
links: []
location:
- do_numa_page()
assignee:
<none>
Vaguely sounded like David might be planning to tackle this as part of his work
on "shared vs exclusive mappings" ("NUMA hinting"??). David?
It should be easy to handle it based on that. Similarly, khugepaged IIRC.
--
Cheers,
David / dhildenb