The patch titled Subject: mm: add sysfs entry to disable splitting underused THPs has been added to the -mm mm-unstable branch. Its filename is mm-add-sysfs-entry-to-disable-splitting-underused-thps.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-add-sysfs-entry-to-disable-splitting-underused-thps.patch This patch will later appear in the mm-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Usama Arif <usamaarif642@xxxxxxxxx> Subject: mm: add sysfs entry to disable splitting underused THPs Date: Fri, 30 Aug 2024 11:03:40 +0100 If disabled, THPs faulted in or collapsed will not be added to _deferred_list, and therefore won't be considered for splitting under memory pressure if underused. Link: https://lkml.kernel.org/r/20240830100438.3623486-7-usamaarif642@xxxxxxxxx Signed-off-by: Usama Arif <usamaarif642@xxxxxxxxx> Cc: Alexander Zhu <alexlzhu@xxxxxx> Cc: Barry Song <baohua@xxxxxxxxxx> Cc: David Hildenbrand <david@xxxxxxxxxx> Cc: Domenico Cerasuolo <cerasuolodomenico@xxxxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: Jonathan Corbet <corbet@xxxxxxx> Cc: Kairui Song <ryncsn@xxxxxxxxx> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx> Cc: Mike Rapoport <rppt@xxxxxxxxxx> Cc: Nico Pache <npache@xxxxxxxxxx> Cc: Rik van Riel <riel@xxxxxxxxxxx> Cc: Roman Gushchin <roman.gushchin@xxxxxxxxx> Cc: Ryan Roberts <ryan.roberts@xxxxxxx> Cc: Shakeel Butt <shakeel.butt@xxxxxxxxx> Cc: Shuang Zhai <zhais@xxxxxxxxxx> Cc: Yu Zhao <yuzhao@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- Documentation/admin-guide/mm/transhuge.rst | 10 +++++++ mm/huge_memory.c | 26 +++++++++++++++++++ 2 files changed, 36 insertions(+) --- a/Documentation/admin-guide/mm/transhuge.rst~mm-add-sysfs-entry-to-disable-splitting-underused-thps +++ a/Documentation/admin-guide/mm/transhuge.rst @@ -202,6 +202,16 @@ PMD-mappable transparent hugepage:: cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size +All THPs at fault and collapse time will be added to _deferred_list, +and will therefore be split under memory presure if they are considered +"underused". A THP is underused if the number of zero-filled pages in +the THP is above max_ptes_none (see below). It is possible to disable +this behaviour by writing 0 to shrink_underused, and enable it by writing +1 to it:: + + echo 0 > /sys/kernel/mm/transparent_hugepage/shrink_underused + echo 1 > /sys/kernel/mm/transparent_hugepage/shrink_underused + khugepaged will be automatically started when PMD-sized THP is enabled (either of the per-size anon control or the top-level control are set to "always" or "madvise"), and it'll be automatically shutdown when --- a/mm/huge_memory.c~mm-add-sysfs-entry-to-disable-splitting-underused-thps +++ a/mm/huge_memory.c @@ -74,6 +74,7 @@ static unsigned long deferred_split_coun struct shrink_control *sc); static unsigned long deferred_split_scan(struct shrinker *shrink, struct shrink_control *sc); +static bool split_underused_thp = true; static atomic_t huge_zero_refcount; struct folio *huge_zero_folio __read_mostly; @@ -440,6 +441,27 @@ static ssize_t hpage_pmd_size_show(struc static struct kobj_attribute hpage_pmd_size_attr = __ATTR_RO(hpage_pmd_size); +static ssize_t split_underused_thp_show(struct kobject *kobj, + struct kobj_attribute *attr, char *buf) +{ + return sysfs_emit(buf, "%d\n", split_underused_thp); +} + +static ssize_t split_underused_thp_store(struct kobject *kobj, + struct kobj_attribute *attr, + const char *buf, size_t count) +{ + int err = kstrtobool(buf, &split_underused_thp); + + if (err < 0) + return err; + + return count; +} + +static struct kobj_attribute split_underused_thp_attr = __ATTR( + shrink_underused, 0644, split_underused_thp_show, split_underused_thp_store); + static struct attribute *hugepage_attr[] = { &enabled_attr.attr, &defrag_attr.attr, @@ -448,6 +470,7 @@ static struct attribute *hugepage_attr[] #ifdef CONFIG_SHMEM &shmem_enabled_attr.attr, #endif + &split_underused_thp_attr.attr, NULL, }; @@ -3601,6 +3624,9 @@ void deferred_split_folio(struct folio * if (folio_order(folio) <= 1) return; + if (!partially_mapped && !split_underused_thp) + return; + /* * The try_to_unmap() in page reclaim path might reach here too, * this may cause a race condition to corrupt deferred split queue. _ Patches currently in -mm which might be from usamaarif642@xxxxxxxxx are revert-mm-skip-cma-pages-when-they-are-not-available.patch revert-mm-skip-cma-pages-when-they-are-not-available-update.patch mm-store-zero-pages-to-be-swapped-out-in-a-bitmap.patch mm-remove-code-to-handle-same-filled-pages.patch mm-introduce-a-pageflag-for-partially-mapped-folios.patch mm-split-underused-thps.patch mm-add-sysfs-entry-to-disable-splitting-underused-thps.patch