It's unusual that many emails sent days ago are resurfacing on LKML.
Please ignore them. By the way, does anyone know what happened?

On Fri, Dec 6, 2024 at 5:12 AM Barry Song <baohua@xxxxxxxxxx> wrote:
>
> On Thu, Aug 8, 2024 at 10:27 PM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
> >
> > On 17/07/2024 08:12, Ryan Roberts wrote:
> > > Hi All,
> > >
> > > This series is an RFC that adds sysfs and kernel cmdline controls to configure
> > > the set of allowed large folio sizes that can be used when allocating
> > > file-memory for the page cache. As part of the control mechanism, it provides
> > > for a special-case "preferred folio size for executable mappings" marker.
> > >
> > > I'm trying to solve 2 separate problems with this series:
> > >
> > > 1. Reduce pressure in iTLB and improve performance on arm64: This is a modified
> > > approach for the change at [1]. Instead of hardcoding the preferred executable
> > > folio size into the arch, user space can now select it. This decouples the arch
> > > code and also makes the mechanism more generic; it can be bypassed (the default)
> > > or any folio size can be set. For my use case, 64K is preferred, but I've also
> > > heard from Willy of a use case where putting all text into 2M PMD-sized folios
> > > is preferred. This approach avoids the need for synchronous MADV_COLLAPSE (and
> > > therefore faulting in all text ahead of time) to achieve that.
> >
> > Just a polite bump on this; I'd really like to get something like this merged to
> > help reduce iTLB pressure. We had a discussion at the THP Cabal meeting a few
> > weeks back without solid conclusion. I haven't heard any concrete objections
> > yet, but also only a luke-warm reception. How can I move this forwards?
>
> Hi Ryan,
>
> These requirements seem to apply to anon, swap, pagecache, and shmem to
> some extent. While the swapin_enabled knob was rejected, the shmem_enabled
> option is already in place.
>
> I wonder if it's possible to use the existing 'enabled' setting across all
> cases, as from an architectural perspective with cont-pte, pagecache may not
> differ from anon. The demand for reducing page faults, LRU overhead, etc., also
> seems quite similar.
>
> I imagine that once Android's file systems support mTHP, we’ll uniformly enable
> 64KB for anon, swap, shmem, and page cache. It should then be sufficient to
> enable all of them using a single knob:
> '/sys/kernel/mm/transparent_hugepage/hugepages-xxkB/enabled'.
>
> Is there anything that makes pagecache and shmem significantly different
> from anon? In my Android case, they all seem the same. However, I assume
> there might be other use cases where differentiating them is necessary?
> >
> > Thanks,
> > Ryan
> >
> >
> > >
> > > 2. Reduce memory fragmentation in systems under high memory pressure (e.g.
> > > Android): The theory goes that if all folios are 64K, then failure to allocate a
> > > 64K folio should become unlikely. But if the page cache is allocating lots of
> > > different orders, with most allocations having an order below 64K (as is the
> > > case today) then ability to allocate 64K folios diminishes. By providing control
> > > over the allowed set of folio sizes, we can tune to avoid crucial 64K folio
> > > allocation failure. Additionally I've heard (second hand) of the need to disable
> > > large folios in the page cache entirely due to latency concerns in some
> > > settings. These controls allow all of this without kernel changes.
> > >
> > > The value of (1) is clear and the performance improvements are documented in
> > > patch 2. I don't yet have any data demonstrating the theory for (2) since I
> > > can't reproduce the setup that Barry had at [2]. But my view is that by adding
> > > these controls we will enable the community to explore further, in the same way
> > > that the anon mTHP controls helped harden the understanding for anonymous
> > > memory.
> > >
> > > ---
> > > This series depends on the "mTHP allocation stats for file-backed memory" series
> > > at [3], which itself applies on top of yesterday's mm-unstable (650b6752c8a3). All
> > > mm selftests have been run; no regressions were observed.
> > >
> > > [1] https://lore.kernel.org/linux-mm/20240215154059.2863126-1-ryan.roberts@xxxxxxx/
> > > [2] https://www.youtube.com/watch?v=ht7eGWqwmNs&list=PLbzoR-pLrL6oj1rVTXLnV7cOuetvjKn9q&index=4
> > > [3] https://lore.kernel.org/linux-mm/20240716135907.4047689-1-ryan.roberts@xxxxxxx/
> > >
> > > Thanks,
> > > Ryan
> > >
> > > Ryan Roberts (4):
> > >   mm: mTHP user controls to configure pagecache large folio sizes
> > >   mm: Introduce "always+exec" for mTHP file_enabled control
> > >   mm: Override mTHP "enabled" defaults at kernel cmdline
> > >   mm: Override mTHP "file_enabled" defaults at kernel cmdline
> > >
> > >  .../admin-guide/kernel-parameters.txt         |  16 ++
> > >  Documentation/admin-guide/mm/transhuge.rst    |  66 +++++++-
> > >  include/linux/huge_mm.h                       |  61 ++++---
> > >  mm/filemap.c                                  |  26 ++-
> > >  mm/huge_memory.c                              | 158 +++++++++++++++++-
> > >  mm/readahead.c                                |  43 ++++-
> > >  6 files changed, 329 insertions(+), 41 deletions(-)
> > >
> > > --
> > > 2.43.0
> > >
> >
>
> Thanks
> Barry
>