[merged] mm-clear_huge_page-move-order-algorithm-into-a-separate-function.patch removed from -mm tree

akpm@xxxxxxxxxxxxxxxxxxxx · Mon, 20 Aug 2018 13:56:22 -0700

The patch titled
     Subject: mm, clear_huge_page: move order algorithm into a separate function
has been removed from the -mm tree.  Its filename was
     mm-clear_huge_page-move-order-algorithm-into-a-separate-function.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Huang Ying <ying.huang@xxxxxxxxx>
Subject: mm, clear_huge_page: move order algorithm into a separate function

Patch series "mm, huge page: Copy target sub-page last when copy huge page", v2.

Huge pages help to reduce the TLB miss rate, but they have a larger cache
footprint, which can sometimes cause problems.  For example, when copying
a huge page on x86_64, the cache footprint is 4M.  But a Xeon E5 v3 2699
CPU has 18 cores, 36 threads, and only 45M of LLC (last level cache).
That is, on average, there is 2.5M of LLC per core and 1.25M of LLC per
thread.

If cache contention is heavy while the huge page is being copied, and we
copy the huge page from beginning to end, the beginning of the huge page
may be evicted from the cache by the time we finish copying the end of
the huge page.  And the application may well access the beginning of the
huge page after the copy.

In c79b57e462b5d ("mm: hugetlb: clear target sub-page last when clearing
huge page"), to keep the cache lines of the target subpage hot, the order
in which clear_huge_page() clears the subpages of a huge page was changed
so that the subpage furthest from the target subpage is cleared first and
the target subpage is cleared last.  A similar change of order helps huge
page copying too.  That is implemented in this patchset.
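
The ordering described above can be illustrated outside the kernel.  The
following is a minimal userspace sketch, not kernel code: it records the
order in which subpage indices would be visited for a given target,
following the same index arithmetic as the patch below.  The name
subpage_order() and its parameters are assumptions made for this
illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace sketch: write into `order` the sequence of subpage indices
 * visited when `target` must be handled last.  `order` needs room for
 * `nr` entries.  Returns the number of entries written.
 */
size_t subpage_order(int nr, int target, int *order)
{
	size_t k = 0;
	int i, base, l;

	assert(nr > 0 && target >= 0 && target < nr);

	if (2 * target <= nr) {
		/* Target in the first half: handle the far tail first. */
		base = 0;
		l = target;
		for (i = nr - 1; i >= 2 * target; i--)
			order[k++] = i;
	} else {
		/* Target in the second half: handle the far head first. */
		base = nr - 2 * (nr - target);
		l = nr - target;
		for (i = 0; i < base; i++)
			order[k++] = i;
	}

	/* Converge on the target, alternating left and right. */
	for (i = 0; i < l; i++) {
		order[k++] = base + i;
		order[k++] = base + 2 * l - 1 - i;
	}
	return k;
}
```

For nr = 8 and target = 2 the recorded order is 7, 6, 5, 4, 0, 3, 1, 2,
so the target subpage is touched last and its neighbours just before it.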

The patchset is a generic optimization that should benefit a fair number
of workloads rather than one specific use case.  To demonstrate the
performance benefit of the patchset, we tested it with vm-scalability
running on transparent huge pages.

With this patchset, throughput increases ~16.6% in the vm-scalability
anon-cow-seq test case with 36 processes on a 2 socket Xeon E5 v3 2699
system (36 cores, 72 threads).  The test case sets
/sys/kernel/mm/transparent_hugepage/enabled to always, mmap()s a big
anonymous memory area and populates it, then forks 36 child processes,
each of which writes to the anonymous memory area from beginning to end,
causing copy on write.  For each child process, the other child processes
can be seen as other workloads generating heavy cache pressure.  At the
same time, the IPC (instructions per cycle) increased from 0.63 to 0.78,
and the time spent in user space was reduced by ~7.2%.
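
The shape of that workload can be sketched in a few lines of userspace C.
This is only an illustration of the populate, fork, write sequence
described above, not the vm-scalability source: the helper name
run_cow_children() is hypothetical, the sketch does not touch the
transparent_hugepage sysfs setting, and it makes no attempt to measure
throughput.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: populate a private anonymous mapping, fork
 * `nchildren` processes, and let each one write the whole mapping from
 * start to end so that its pages are copied on write.  Returns the
 * number of children that exited cleanly, or -1 on failure.
 */
int run_cow_children(size_t bytes, int nchildren)
{
	int forked = 0, ok = 0, status, intact;
	unsigned char *buf;

	buf = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 1, bytes);		/* populate in the parent */

	for (int c = 0; c < nchildren; c++) {
		pid_t pid = fork();

		if (pid < 0)
			break;
		if (pid == 0) {
			/* Child: sequential writes trigger copy on write. */
			memset(buf, 2, bytes);
			_exit(0);
		}
		forked++;
	}

	/* Reap exactly the children that were started above. */
	for (int c = 0; c < forked; c++) {
		if (wait(&status) > 0 &&
		    WIFEXITED(status) && WEXITSTATUS(status) == 0)
			ok++;
	}

	/* The parent's copy must be untouched by the children's writes. */
	intact = buf[0] == 1 && buf[bytes - 1] == 1;
	munmap(buf, bytes);
	return intact ? ok : -1;
}
```

A call such as run_cow_children(1 << 20, 4) exercises the same fault path
on a small region; the real test uses a much larger area and 36 children.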


This patch (of 4):

In c79b57e462b5d ("mm: hugetlb: clear target sub-page last when clearing
huge page"), to keep the cache lines of the target subpage hot, the order
in which clear_huge_page() clears the subpages of a huge page was changed
so that the subpage furthest from the target subpage is cleared first and
the target subpage is cleared last.  This optimization can be applied to
copying huge pages too, with the same order algorithm.  To avoid code
duplication and reduce maintenance overhead, this patch moves the order
algorithm out of clear_huge_page() into a separate function,
process_huge_page(), so that we can use it for copying huge pages too.

This changes the direct calls to clear_user_highpage() into indirect
calls.  But with proper inlining support in the compiler, the indirect
calls will be optimized back into direct calls.  Our tests show no
performance change with the patch.
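
The pattern in question, an inline walker that takes a function pointer
and is called with a constant callback, can be sketched in userspace as
follows.  All names here (for_each_subpage(), zero_region(),
copy_region(), SUBPAGE_SIZE) are assumptions made for the illustration
and are not part of the patch.  Whether the call is actually
devirtualized depends on the compiler and optimization level.

```c
#include <string.h>

#define SUBPAGE_SIZE 64		/* hypothetical tiny "subpage" for the demo */

typedef void (*subpage_fn)(unsigned char *base, int idx, void *arg);

/*
 * Generic walker.  Because it is inline and every caller passes a
 * constant callback, an optimizing compiler can inline the walker and
 * then resolve fn() to a direct (or inlined) call.
 */
static inline void for_each_subpage(unsigned char *base, int nr,
				    subpage_fn fn, void *arg)
{
	for (int i = 0; i < nr; i++)
		fn(base, i, arg);
}

static void zero_subpage(unsigned char *base, int idx, void *arg)
{
	(void)arg;		/* no extra state needed for clearing */
	memset(base + idx * SUBPAGE_SIZE, 0, SUBPAGE_SIZE);
}

static void copy_one_subpage(unsigned char *base, int idx, void *arg)
{
	const unsigned char *src = arg;	/* source region passed via arg */

	memcpy(base + idx * SUBPAGE_SIZE, src + idx * SUBPAGE_SIZE,
	       SUBPAGE_SIZE);
}

/* Two operations sharing one walker, selected by the callback. */
void zero_region(unsigned char *base, int nr)
{
	for_each_subpage(base, nr, zero_subpage, NULL);
}

void copy_region(unsigned char *dst, const unsigned char *src, int nr)
{
	for_each_subpage(dst, nr, copy_one_subpage, (void *)src);
}
```

The void *arg parameter plays the same role as in process_huge_page():
it carries whatever per-operation state the callback needs, so the walker
itself stays independent of the operation.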

This patch is a code cleanup with no functional change.

Link: http://lkml.kernel.org/r/20180524005851.4079-2-ying.huang@xxxxxxxxx
Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
Suggested-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Cc: Andi Kleen <andi.kleen@xxxxxxxxx>
Cc: Jan Kara <jack@xxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Minchan Kim <minchan@xxxxxxxxxx>
Cc: Shaohua Li <shli@xxxxxx>
Cc: Christopher Lameter <cl@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/memory.c |   90 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 56 insertions(+), 34 deletions(-)

--- a/mm/memory.c~mm-clear_huge_page-move-order-algorithm-into-a-separate-function
+++ a/mm/memory.c
@@ -4599,71 +4599,93 @@ EXPORT_SYMBOL(__might_fault);
 #endif
 
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
-static void clear_gigantic_page(struct page *page,
-				unsigned long addr,
-				unsigned int pages_per_huge_page)
-{
-	int i;
-	struct page *p = page;
-
-	might_sleep();
-	for (i = 0; i < pages_per_huge_page;
-	     i++, p = mem_map_next(p, page, i)) {
-		cond_resched();
-		clear_user_highpage(p, addr + i * PAGE_SIZE);
-	}
-}
-void clear_huge_page(struct page *page,
-		     unsigned long addr_hint, unsigned int pages_per_huge_page)
+/*
+ * Process all subpages of the specified huge page with the specified
+ * operation.  The target subpage will be processed last to keep its
+ * cache lines hot.
+ */
+static inline void process_huge_page(
+	unsigned long addr_hint, unsigned int pages_per_huge_page,
+	void (*process_subpage)(unsigned long addr, int idx, void *arg),
+	void *arg)
 {
 	int i, n, base, l;
 	unsigned long addr = addr_hint &
 		~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1);
 
-	if (unlikely(pages_per_huge_page > MAX_ORDER_NR_PAGES)) {
-		clear_gigantic_page(page, addr, pages_per_huge_page);
-		return;
-	}
-
-	/* Clear sub-page to access last to keep its cache lines hot */
+	/* Process target subpage last to keep its cache lines hot */
 	might_sleep();
 	n = (addr_hint - addr) / PAGE_SIZE;
 	if (2 * n <= pages_per_huge_page) {
-		/* If sub-page to access in first half of huge page */
+		/* If target subpage in first half of huge page */
 		base = 0;
 		l = n;
-		/* Clear sub-pages at the end of huge page */
+		/* Process subpages at the end of huge page */
 		for (i = pages_per_huge_page - 1; i >= 2 * n; i--) {
 			cond_resched();
-			clear_user_highpage(page + i, addr + i * PAGE_SIZE);
+			process_subpage(addr + i * PAGE_SIZE, i, arg);
 		}
 	} else {
-		/* If sub-page to access in second half of huge page */
+		/* If target subpage in second half of huge page */
 		base = pages_per_huge_page - 2 * (pages_per_huge_page - n);
 		l = pages_per_huge_page - n;
-		/* Clear sub-pages at the begin of huge page */
+		/* Process subpages at the begin of huge page */
 		for (i = 0; i < base; i++) {
 			cond_resched();
-			clear_user_highpage(page + i, addr + i * PAGE_SIZE);
+			process_subpage(addr + i * PAGE_SIZE, i, arg);
 		}
 	}
 	/*
-	 * Clear remaining sub-pages in left-right-left-right pattern
-	 * towards the sub-page to access
+	 * Process remaining subpages in left-right-left-right pattern
+	 * towards the target subpage
 	 */
 	for (i = 0; i < l; i++) {
 		int left_idx = base + i;
 		int right_idx = base + 2 * l - 1 - i;
 
 		cond_resched();
-		clear_user_highpage(page + left_idx,
-				    addr + left_idx * PAGE_SIZE);
+		process_subpage(addr + left_idx * PAGE_SIZE, left_idx, arg);
 		cond_resched();
-		clear_user_highpage(page + right_idx,
-				    addr + right_idx * PAGE_SIZE);
+		process_subpage(addr + right_idx * PAGE_SIZE, right_idx, arg);
 	}
 }
 
+static void clear_gigantic_page(struct page *page,
+				unsigned long addr,
+				unsigned int pages_per_huge_page)
+{
+	int i;
+	struct page *p = page;
+
+	might_sleep();
+	for (i = 0; i < pages_per_huge_page;
+	     i++, p = mem_map_next(p, page, i)) {
+		cond_resched();
+		clear_user_highpage(p, addr + i * PAGE_SIZE);
+	}
+}
+
+static void clear_subpage(unsigned long addr, int idx, void *arg)
+{
+	struct page *page = arg;
+
+	clear_user_highpage(page + idx, addr);
+}
+
+void clear_huge_page(struct page *page,
+		     unsigned long addr_hint, unsigned int pages_per_huge_page)
+{
+	unsigned long addr = addr_hint &
+		~(((unsigned long)pages_per_huge_page << PAGE_SHIFT) - 1);
+
+	if (unlikely(pages_per_huge_page > MAX_ORDER_NR_PAGES)) {
+		clear_gigantic_page(page, addr, pages_per_huge_page);
+		return;
+	}
+
+	process_huge_page(addr_hint, pages_per_huge_page, clear_subpage, page);
+}
+
 static void copy_user_gigantic_page(struct page *dst, struct page *src,
 				    unsigned long addr,
 				    struct vm_area_struct *vma,
_

Patches currently in -mm which might be from ying.huang@xxxxxxxxx are

swap-add-comments-to-lock_cluster_or_swap_info.patch
mm-swapfilec-replace-some-ifdef-with-is_enabled.patch
swap-use-swap_count-in-swap_page_trans_huge_swapped.patch
swap-unify-normal-huge-code-path-in-swap_page_trans_huge_swapped.patch
swap-unify-normal-huge-code-path-in-put_swap_page.patch
swap-get_swap_pages-use-entry_size-instead-of-cluster-in-parameter.patch
swap-add-__swap_entry_free_locked.patch
swap-put_swap_page-share-more-between-huge-normal-code-path.patch
mm-swap-fix-race-between-swapoff-and-some-swap-operations.patch
mm-swap-fix-race-between-swapoff-and-some-swap-operations-v6.patch
mm-fix-race-between-swapoff-and-mincore.patch