The quilt patch titled Subject: mm/ksm: refactor out try_to_merge_with_zero_page() has been removed from the -mm tree. Its filename was mm-ksm-refactor-out-try_to_merge_with_zero_page.patch This patch was dropped because it was merged into the mm-stable branch of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm ------------------------------------------------------ From: Chengming Zhou <chengming.zhou@xxxxxxxxx> Subject: mm/ksm: refactor out try_to_merge_with_zero_page() Date: Fri, 21 Jun 2024 15:54:29 +0800 Patch series "mm/ksm: cmp_and_merge_page() optimizations and cleanup", v2. This series mainly optimizes cmp_and_merge_page() to have more efficient separate code flow for ksm page and non-ksm anon page. - ksm page: don't need to calculate the checksum obviously. - anon page: don't need to search stable tree if changing fast and try to merge with zero page before searching ksm page on stable tree. Please see the patch-2 for details. Patch-3 is cleanup also a little optimization for the chain()/chain_prune interfaces, which made the stable_tree_search()/stable_tree_insert() over complex. I have done simple testing using "hackbench -g 1 -l 300000" (maybe I need to use a better workload) on my machine, have seen a little CPU usage decrease of ksmd and some improvements of cmp_and_merge_page() latency: We can see the latency of cmp_and_merge_page() when handling non-ksm anon pages has been improved. This patch (of 3): In preparation for later changes, refactor out a new function called try_to_merge_with_zero_page(), which tries to merge with zero page. Link: https://lkml.kernel.org/r/20240621-b4-ksm-scan-optimize-v2-0-1c328aa9e30b@xxxxxxxxx Link: https://lkml.kernel.org/r/20240621-b4-ksm-scan-optimize-v2-1-1c328aa9e30b@xxxxxxxxx Signed-off-by: Chengming Zhou <chengming.zhou@xxxxxxxxx> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx> Cc: David Hildenbrand <david@xxxxxxxxxx> Cc: Hugh Dickins <hughd@xxxxxxxxxx> Cc: Stefan Roesch <shr@xxxxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/hugetlb.c | 1 mm/ksm.c | 70 ++++++++++++++++++++++++++++--------------------- 2 files changed, 40 insertions(+), 31 deletions(-) --- a/mm/hugetlb.c~mm-ksm-refactor-out-try_to_merge_with_zero_page +++ a/mm/hugetlb.c @@ -2666,7 +2666,6 @@ static int gather_surplus_pages(struct h retry: spin_unlock_irq(&hugetlb_lock); for (i = 0; i < needed; i++) { - folio = NULL; for_each_node_mask(node, cpuset_current_mems_allowed) { if (!mbind_nodemask || node_isset(node, *mbind_nodemask)) { folio = alloc_surplus_hugetlb_folio(h, htlb_alloc_mask(h), --- a/mm/ksm.c~mm-ksm-refactor-out-try_to_merge_with_zero_page +++ a/mm/ksm.c @@ -1528,6 +1528,44 @@ out: } /* + * This function returns 0 if the pages were merged or if they are + * no longer merging candidates (e.g., VMA stale), -EFAULT otherwise. + */ +static int try_to_merge_with_zero_page(struct ksm_rmap_item *rmap_item, + struct page *page) +{ + struct mm_struct *mm = rmap_item->mm; + int err = -EFAULT; + + /* + * Same checksum as an empty page. We attempt to merge it with the + * appropriate zero page if the user enabled this via sysfs. + */ + if (ksm_use_zero_pages && (rmap_item->oldchecksum == zero_checksum)) { + struct vm_area_struct *vma; + + mmap_read_lock(mm); + vma = find_mergeable_vma(mm, rmap_item->address); + if (vma) { + err = try_to_merge_one_page(vma, page, + ZERO_PAGE(rmap_item->address)); + trace_ksm_merge_one_page( + page_to_pfn(ZERO_PAGE(rmap_item->address)), + rmap_item, mm, err); + } else { + /* + * If the vma is out of date, we do not need to + * continue. + */ + err = 0; + } + mmap_read_unlock(mm); + } + + return err; +} + +/* * try_to_merge_with_ksm_page - like try_to_merge_two_pages, * but no new kernel page is allocated: kpage must already be a ksm page. * @@ -2302,7 +2340,6 @@ static void stable_tree_append(struct ks */ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_item) { - struct mm_struct *mm = rmap_item->mm; struct ksm_rmap_item *tree_rmap_item; struct page *tree_page = NULL; struct ksm_stable_node *stable_node; @@ -2371,36 +2408,9 @@ static void cmp_and_merge_page(struct pa return; } - /* - * Same checksum as an empty page. We attempt to merge it with the - * appropriate zero page if the user enabled this via sysfs. - */ - if (ksm_use_zero_pages && (checksum == zero_checksum)) { - struct vm_area_struct *vma; + if (!try_to_merge_with_zero_page(rmap_item, page)) + return; - mmap_read_lock(mm); - vma = find_mergeable_vma(mm, rmap_item->address); - if (vma) { - err = try_to_merge_one_page(vma, page, - ZERO_PAGE(rmap_item->address)); - trace_ksm_merge_one_page( - page_to_pfn(ZERO_PAGE(rmap_item->address)), - rmap_item, mm, err); - } else { - /* - * If the vma is out of date, we do not need to - * continue. - */ - err = 0; - } - mmap_read_unlock(mm); - /* - * In case of failure, the page was not really empty, so we - * need to continue. Otherwise we're done. - */ - if (!err) - return; - } tree_rmap_item = unstable_tree_search_insert(rmap_item, page, &tree_page); if (tree_rmap_item) { _ Patches currently in -mm which might be from chengming.zhou@xxxxxxxxx are mm-zswap-use-only-one-pool-in-zswap.patch mm-zswap-use-only-one-pool-in-zswap-v2.patch mm-zswap-use-only-one-pool-in-zswap-v3.patch mm-zsmalloc-fix-class-per-fullness-zspage-counts.patch mm-zsmalloc-move-record_obj-into-obj_malloc.patch