On 21.12.23 21:40, Zach O'Keefe wrote:
Hey Xu,
Thanks for the patches.
As a precursor, can you help me understand what the use case is for these
patches? In-place collapse of anon memory is something I've thought
about before, but the opportunity has never been especially clear.
In particular, your patches take an order-9 compound page, and just
try to see if we can update the mappings to it (like we do with
file/shmem). Functionally this seems fine, but the difference is that
with file/shmem, it's quite easy to have a pte-mapped-hugepage arise
naturally (the formation of the hugepage happening in the pagecache
being logically separate from the pmd-mapping of w/e task is mapping
it).
For anonymous memory, the only time I can see us having a pte-mapped
hugepage (that isn't destined for splitting on deferred split list)
that we want to remap by a pmd is if we cause a VMA split + remerge by
mucking with VMA attributes.
Yes, mostly because of madvise(), mprotect(), mremap(). But it also
happens right now when a THP is put into the swap cache: when
refaulting, you get a PTE-mapped THP.
There are some other odd cases, and there might be more in the future
(below).
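To make the "VMA split + remerge" case concrete, here is a minimal userspace sketch in C. It is not taken from the patches. The helper name split_and_remerge_vma() and the 2 MiB PMD size are assumptions for illustration (x86-64 with 4 KiB base pages), and whether the range is really backed by a THP depends on the system's THP settings, so MADV_HUGEPAGE is treated as a hint only.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: PMD-sized THPs are 2 MiB (e.g. x86-64 with 4 KiB pages). */
#define PMD_SIZE_ASSUMED (2UL << 20)

/*
 * Hypothetical helper: change and then restore the protection of a single
 * base page inside a PMD-aligned anonymous range. The first mprotect()
 * splits the VMA (and a PMD mapping, if there was one); the second lets
 * the VMAs merge again, but the page tables stay PTE-level.
 * Returns 0 if every required syscall succeeded, -1 otherwise.
 */
int split_and_remerge_vma(void)
{
    size_t len = 2 * PMD_SIZE_ASSUMED;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return -1;

    /* Round up to a PMD boundary; the 2x over-allocation leaves room. */
    char *huge = (char *)(((uintptr_t)raw + PMD_SIZE_ASSUMED - 1) &
                          ~(uintptr_t)(PMD_SIZE_ASSUMED - 1));
    assert(((uintptr_t)huge % PMD_SIZE_ASSUMED) == 0);

#ifdef MADV_HUGEPAGE
    /* Hint only: may fail if THP is not configured, which is fine here. */
    (void)madvise(huge, PMD_SIZE_ASSUMED, MADV_HUGEPAGE);
#endif
    memset(huge, 0xAB, PMD_SIZE_ASSUMED); /* fault the range in */

    int rc = 0;
    char *victim = huge + 16 * page;
    /* Different protection on one page: VMA split, PMD mapping split. */
    if (mprotect(victim, page, PROT_READ) != 0)
        rc = -1;
    /* Same protection again: VMAs can remerge, mapping stays PTE-level. */
    if (rc == 0 && mprotect(victim, page, PROT_READ | PROT_WRITE) != 0)
        rc = -1;

    munmap(raw, len);
    return rc;
}
```

If the range was PMD-mapped before the first mprotect(), the folio should still be intact afterwards and only the mapping has changed, which is the situation an in-place collapse would target.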
In my mind, what I had been thinking of w.r.t in-place anon collapse
was for the case where we've split a THP with MADV_FREE/MADV_DONTNEED
(i.e. to subrelease pages back to kernel), but later want to reform
the THP.
Right, and in-place collapse even works if the folio has been pinned,
which is nice.
In particular, if, for example, we only subrelease O(10s) of order-0
pages, it seems wasteful to have to reallocate a fresh
hugepage, then copy over O(100s) of pages, on collapse. If we were
able to attempt to first migrate-away any of those previously
subreleased pages (now possibly backing some other memory entirely),
it could save us from having to allocate a fresh order-9 page. Under
memory pressure / fragmentation, this could mean the difference
between success and failure.
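The subrelease-then-reform sequence described above can be sketched from userspace as follows. This is an illustration under assumptions, not code from the patches: the helper name subrelease_then_collapse() is hypothetical, the PMD size is assumed to be 2 MiB, and the fallback value 25 for MADV_COLLAPSE is the one in the Linux uapi headers (6.1 and later). As far as I understand the current behaviour, MADV_COLLAPSE on anonymous memory allocates a fresh hugepage and copies into it, which is exactly the cost being discussed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* Linux uapi value; needs a 6.1+ kernel at runtime */
#endif

/* Assumption: PMD-sized THPs are 2 MiB (e.g. x86-64 with 4 KiB pages). */
#define PMD_SIZE_ASSUMED (2UL << 20)

/*
 * Hypothetical helper: fault in a PMD-aligned anonymous range, subrelease
 * the first npages base pages with MADV_DONTNEED, then ask the kernel to
 * reform the THP with MADV_COLLAPSE.
 * Returns 0 if setup and subrelease behaved as expected, -1 otherwise.
 * *collapse_errno is 0 if the collapse succeeded, else the errno it set
 * (e.g. EINVAL on kernels without MADV_COLLAPSE).
 */
int subrelease_then_collapse(size_t npages, int *collapse_errno)
{
    size_t len = 2 * PMD_SIZE_ASSUMED;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (npages == 0 || npages * page >= PMD_SIZE_ASSUMED)
        return -1;

    char *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return -1;

    /* Round up to a PMD boundary; the 2x over-allocation leaves room. */
    char *huge = (char *)(((uintptr_t)raw + PMD_SIZE_ASSUMED - 1) &
                          ~(uintptr_t)(PMD_SIZE_ASSUMED - 1));
    assert(((uintptr_t)huge % PMD_SIZE_ASSUMED) == 0);

#ifdef MADV_HUGEPAGE
    (void)madvise(huge, PMD_SIZE_ASSUMED, MADV_HUGEPAGE); /* hint only */
#endif
    memset(huge, 0x5A, PMD_SIZE_ASSUMED); /* fault the range in */

    /* Subrelease: hand the first npages base pages back to the kernel. */
    if (madvise(huge, npages * page, MADV_DONTNEED) != 0) {
        munmap(raw, len);
        return -1;
    }

    /* Released private anon pages read back as zero; the rest is intact. */
    int ok = (huge[0] == 0) && (huge[npages * page] == 0x5A);

    /* Try to reform the THP; failure is reported, not treated as fatal. */
    *collapse_errno =
        (madvise(huge, PMD_SIZE_ASSUMED, MADV_COLLAPSE) == 0) ? 0 : errno;

    munmap(raw, len);
    return ok ? 0 : -1;
}
```

A caller could release, say, 16 pages and then inspect *collapse_errno to see whether the kernel managed to reform the THP. Under fragmentation that is where the fresh order-9 allocation would be expected to fail.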
One thing that popped up a couple of times already is that we might want
to PTE-map a PMD-sized THP for a couple of reasons (IIRC, FreeBSD does
some of that). For example:
* Lazily zero the pages of the folio on demand, keeping all non-zeroed
parts protnone. At a certain time (e.g., all zeroed), simply remap
using a PMD.
* Detecting sub-page access by temporarily mapping the THP using PTEs.
Maybe also some uffd optimizations, whereby the protnone parts are not
faulted in yet.
--
Cheers,
David / dhildenb