+ hugetlbfs-fix-potential-over-underflow-setting-node-specific-nr_hugepages.patch added to -mm tree

akpm@xxxxxxxxxxxxxxxxxxxx · Tue, 26 Feb 2019 14:36:26 -0800

The patch titled
     Subject: hugetlbfs: fix potential over/underflow setting node specific nr_hugepages
has been added to the -mm tree.  Its filename is
     hugetlbfs-fix-potential-over-underflow-setting-node-specific-nr_hugepages.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/hugetlbfs-fix-potential-over-underflow-setting-node-specific-nr_hugepages.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/hugetlbfs-fix-potential-over-underflow-setting-node-specific-nr_hugepages.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included in linux-next and is updated
there every 3-4 working days.

------------------------------------------------------
From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Subject: hugetlbfs: fix potential over/underflow setting node specific nr_hugepages

The number of node specific huge pages can be set via a file such as:
/sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
When a node specific value is specified, the global number of huge pages
must also be adjusted.  This adjustment is calculated as the specified
node specific value + (global value - current node value).  If the node
specific value provided by the user is large enough, this calculation
can overflow an unsigned long, wrapping around and resulting in a smaller
than expected number of huge pages.

To fix, check the calculation for overflow.  If overflow is detected, use
ULONG_MAX as the requested value.  This is in line with the user request to
allocate as many huge pages as possible.
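
As a rough userspace illustration of the check described above (not the
kernel code itself; the helper name adjusted_count() and its parameters
are made up for this sketch, and it assumes global >= node_current), the
wraparound test and saturation could look like:

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical userspace helper, not kernel code: computes
 * requested + (global - node_current) and saturates at ULONG_MAX
 * when the unsigned addition wraps around.
 * Assumes global >= node_current.
 */
static unsigned long adjusted_count(unsigned long requested,
				    unsigned long global,
				    unsigned long node_current)
{
	unsigned long count = requested + (global - node_current);

	/* Unsigned wraparound makes the sum smaller than the input. */
	if (count < requested)
		count = ULONG_MAX;
	return count;
}

static void adjusted_count_examples(void)
{
	/* No overflow: 8 requested on the node, 6 pages live elsewhere. */
	assert(adjusted_count(8, 10, 4) == 14);
	/* Overflow: ULONG_MAX + 6 wraps, so the result saturates. */
	assert(adjusted_count(ULONG_MAX, 10, 4) == ULONG_MAX);
}
```

Comparing the sum against the original request is sufficient here because
unsigned arithmetic in C is defined to wrap modulo 2^N, so a wrapped sum
is always smaller than either operand.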

It was also noticed that the above calculation was done outside the
hugetlb_lock.  Therefore, the values could be inconsistent and result in
underflow.  To fix, the calculation is moved to within the routine
set_max_huge_pages() where the lock is held.
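
To illustrate why inconsistent values matter, here is a small userspace
sketch (again not kernel code; pages_elsewhere() is a hypothetical name).
It assumes the global count was sampled before some other thread added
pages to the node, and the per-node count was sampled afterwards, so the
unsigned subtraction wraps:

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: number of huge pages outside one node,
 * computed with unsigned arithmetic as in the description above.
 */
static unsigned long pages_elsewhere(unsigned long global,
				     unsigned long node)
{
	return global - node;
}

static void pages_elsewhere_examples(void)
{
	/* Consistent snapshot: 10 pages globally, 4 on the node. */
	assert(pages_elsewhere(10, 4) == 6);
	/*
	 * Inconsistent snapshot: stale global value of 4, fresh node
	 * value of 6.  4 - 6 wraps around to ULONG_MAX - 1.
	 */
	assert(pages_elsewhere(4, 6) == ULONG_MAX - 1);
}
```

Reading both counters while holding the same lock rules out the second
case, since the per-node count can then never exceed the global count.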

Link: http://lkml.kernel.org/r/e2bded2f-40ca-c308-5525-0a21777ed221@xxxxxxxxxx
Reported-by: Jing Xiangfeng <jingxiangfeng@xxxxxxxxxx>
Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
Tested-by: Jing Xiangfeng <jingxiangfeng@xxxxxxxxxx>
Acked-by: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
Cc: "Kirill A . Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/hugetlb.c |   34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

--- a/mm/hugetlb.c~hugetlbfs-fix-potential-over-underflow-setting-node-specific-nr_hugepages
+++ a/mm/hugetlb.c
@@ -2274,7 +2274,7 @@ found:
 }
 
 #define persistent_huge_pages(h) (h->nr_huge_pages - h->surplus_huge_pages)
-static int set_max_huge_pages(struct hstate *h, unsigned long count,
+static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 						nodemask_t *nodes_allowed)
 {
 	unsigned long min_count, ret;
@@ -2289,6 +2289,23 @@ static int set_max_huge_pages(struct hst
 		goto decrease_pool;
 	}
 
+	spin_lock(&hugetlb_lock);
+
+	/*
+	 * Check for a node specific request.  Adjust global count, but
+	 * restrict alloc/free to the specified node.
+	 */
+	if (nid != NUMA_NO_NODE) {
+		unsigned long old_count = count;
+		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
+		/*
+		 * If user specified count causes overflow, set to
+		 * largest possible value.
+		 */
+		if (count < old_count)
+			count = ULONG_MAX;
+	}
+
 	/*
 	 * Increase the pool size
 	 * First take pages out of surplus state.  Then make up the
@@ -2300,7 +2317,6 @@ static int set_max_huge_pages(struct hst
 	 * pool might be one hugepage larger than it needs to be, but
 	 * within all the constraints specified by the sysctls.
 	 */
-	spin_lock(&hugetlb_lock);
 	while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
 		if (!adjust_pool_surplus(h, nodes_allowed, -1))
 			break;
@@ -2421,16 +2437,18 @@ static ssize_t __nr_hugepages_store_comm
 			nodes_allowed = &node_states[N_MEMORY];
 		}
 	} else if (nodes_allowed) {
+		/* Node specific request */
+		init_nodemask_of_node(nodes_allowed, nid);
+	} else {
 		/*
-		 * per node hstate attribute: adjust count to global,
-		 * but restrict alloc/free to the specified node.
+		 * Node specific request, but we could not allocate
+		 * node mask.  Pass in ALL nodes, and clear nid.
 		 */
-		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
-		init_nodemask_of_node(nodes_allowed, nid);
-	} else
+		nid = NUMA_NO_NODE;
 		nodes_allowed = &node_states[N_MEMORY];
+	}
 
-	err = set_max_huge_pages(h, count, nodes_allowed);
+	err = set_max_huge_pages(h, count, nid, nodes_allowed);
 	if (err)
 		goto out;
 
_

Patches currently in -mm which might be from mike.kravetz@xxxxxxxxxx are

huegtlbfs-fix-races-and-page-leaks-during-migration.patch
huegtlbfs-fix-races-and-page-leaks-during-migration-update.patch
hugetlbfs-fix-potential-over-underflow-setting-node-specific-nr_hugepages.patch