On 08.05.24 13:39, Daniel Gomez wrote:
On Mon, May 06, 2024 at 04:46:24PM +0800, Baolin Wang wrote:
Multi-size THP (mTHP) allocation is already supported for anonymous pages
through commit 19eaf44954df, which allows THP to be configured through the
sysfs interface located at '/sys/kernel/mm/transparent_hugepage/hugepage-XXkb/enabled'.
However, anonymous shared pages ignore the anonymous mTHP rule configured
through the sysfs interface and can only use PMD-mapped THP, which is not
reasonable. Many users implement anonymous page sharing through
mmap(MAP_SHARED | MAP_ANONYMOUS), especially in database usage scenarios.
Users therefore expect a unified mTHP strategy that covers all anonymous
pages, including anonymous shared pages, in order to enjoy the benefits of
mTHP: for example, lower latency than PMD-mapped THP, less memory bloat than
PMD-mapped THP, and contiguous PTEs on ARM architecture to reduce TLB misses.
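
For readers unfamiliar with the mmap(MAP_SHARED | MAP_ANONYMOUS) pattern
mentioned above, here is a minimal sketch of it in Python on Linux. The
helper name shared_anon_roundtrip is made up for illustration. It only shows
the sharing semantics and says nothing about which folio size the kernel
picks for the mapping.

```python
import mmap
import os
import struct


def shared_anon_roundtrip(value, length=mmap.PAGESIZE):
    """Child writes `value` into a shared anonymous region; parent reads it back."""
    # fileno=-1 requests an anonymous mapping; MAP_SHARED keeps it shared across fork().
    region = mmap.mmap(-1, length, flags=mmap.MAP_SHARED | mmap.MAP_ANONYMOUS)
    try:
        pid = os.fork()
        if pid == 0:
            # Child: store the value at offset 0, then exit immediately
            # (os._exit skips the parent's cleanup code below).
            struct.pack_into("<Q", region, 0, value)
            os._exit(0)
        os.waitpid(pid, 0)
        # Parent: the child's write is visible because both processes
        # map the same underlying pages.
        return struct.unpack_from("<Q", region, 0)[0]
    finally:
        region.close()


if __name__ == "__main__":
    print(shared_anon_roundtrip(42))
```

Such mappings are backed by shmem in the kernel, which is why they follow
the shmem THP settings rather than the anonymous ones.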
The primary strategy is similar to supporting anonymous mTHP. Introduce
a new interface '/mm/transparent_hugepage/hugepage-XXkb/shmem_enabled',
which accepts all the same values as the top-level
'/sys/kernel/mm/transparent_hugepage/shmem_enabled', plus an additional
"inherit" option. By default all sizes will be set to "never" except PMD
size, which is set to "inherit". This ensures backward compatibility with
the top-level shmem_enabled setting, while also allowing independent control
of shmem_enabled for each mTHP size.
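
The defaults and the "inherit" rule described above can be modelled roughly
as follows. This is a userspace sketch, not kernel code: the function names,
the example size list, and the 2048 kB PMD size (x86-64 with 4 KiB base
pages) are assumptions for illustration. The bracket parsing follows the way
the existing THP sysfs files mark the active value.

```python
import re

PMD_SIZE_KB = 2048  # assumption: x86-64 with 4 KiB base pages


def parse_selected(sysfs_text):
    """Return the active value, which sysfs shows in [brackets]."""
    match = re.search(r"\[(\w+)\]", sysfs_text)
    if match is None:
        raise ValueError(f"no active value in {sysfs_text!r}")
    return match.group(1)


def default_per_size(sizes_kb, pmd_size_kb=PMD_SIZE_KB):
    """Proposed defaults: "never" for every size except "inherit" at PMD size."""
    return {size: ("inherit" if size == pmd_size_kb else "never")
            for size in sizes_kb}


def effective(per_size_value, top_level_value):
    """Resolve "inherit" against the top-level shmem_enabled value."""
    return top_level_value if per_size_value == "inherit" else per_size_value


if __name__ == "__main__":
    top = parse_selected("always within_size advise [never] deny force")
    defaults = default_per_size([64, 128, 2048])
    print({size: effective(value, top) for size, value in defaults.items()})
```

With these defaults only the PMD size follows the top-level setting, so an
existing configuration behaves as before until a smaller size is enabled
explicitly.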
I'm trying to understand the adoption of mTHP and how it fits into the adoption
of (large) folios that the kernel is moving towards. Can you, or anyone involved
here, explain this? How much do they overlap, and can we benefit from having
both? Is there any argument against the adoption of large folios here that I
might have missed?
mTHP are implemented using large folios, just like traditional PMD-sized
THP are. (you really should explore the history of mTHP and how it all
works internally)
The biggest challenge with memory that cannot simply be evicted and
reclaimed under memory pressure (in contrast to your ordinary files in the
pagecache) is memory waste and, more generally, the placement of large
chunks of memory during page faults.
In the worst case (no swap), you allocate a large chunk of memory once
and it will stick around until freed: no reclaim of that memory.
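
As a rough illustration of that worst case, the following sketch computes
how much memory stays allocated beyond what base pages would have needed
when a fault touches a single byte. The 4 KiB base page size and the helper
name worst_case_waste are assumptions made for the example.

```python
BASE_PAGE = 4 * 1024  # assumption: 4 KiB base pages


def worst_case_waste(folio_size, touched_bytes=1, base_page=BASE_PAGE):
    """Bytes allocated beyond the base pages actually needed."""
    # Round the touched range up to whole base pages.
    needed = -(-touched_bytes // base_page) * base_page
    if needed > folio_size:
        raise ValueError("touched range does not fit in one folio")
    return folio_size - needed


if __name__ == "__main__":
    for label, size in (("64 KiB mTHP", 64 * 1024), ("2 MiB PMD THP", 2 * 1024 * 1024)):
        print(label, worst_case_waste(size), "bytes unused")
```

Without swap, that unused remainder stays allocated for as long as the
mapping lives.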
That's the reason why THP for anonymous memory and SHMEM have toggles to
manually enable and configure them, in contrast to the pagecache. The
same was done for mTHP for anonymous memory, and now (anon) shmem follows.
There are plans to have it all working automatically at some point, but a
lot of what that requires for anonymous memory (and similarly for shmem) is
still missing and unclear.
--
Cheers,
David / dhildenb