> On 2/24/19 7:17 PM, David Rientjes wrote:
>> On Sun, 24 Feb 2019, Mike Kravetz wrote:
>>>> @@ -2423,7 +2423,14 @@ static ssize_t __nr_hugepages_store_common(bool obey_mempolicy,
>>>>  		 * per node hstate attribute: adjust count to global,
>>>>  		 * but restrict alloc/free to the specified node.
>>>>  		 */
>>>> +		unsigned long old_count = count;
>>>>  		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
>>>> +		/*
>>>> +		 * If user specified count causes overflow, set to
>>>> +		 * largest possible value.
>>>> +		 */
>>>> +		if (count < old_count)
>>>> +			count = ULONG_MAX;
>>>>  		init_nodemask_of_node(nodes_allowed, nid);
>>>>  	} else
>>>>  		nodes_allowed = &node_states[N_MEMORY];
>>>>
>>
>> Looks like this fixes the overflow issue, but isn't there already a
>> possible underflow since we don't hold hugetlb_lock?  Even if
>> count == 0, what prevents h->nr_huge_pages_node[nid] being greater than
>> h->nr_huge_pages here?  I think the per hstate values need to be read with
>> READ_ONCE() and stored on the stack to do any sane bounds checking.
>
> Yes, without holding the lock there is the potential for issues.  Looking
> back to when the node specific code was added, there is a comment about
> "re-use/share as much of the existing global hstate attribute initialization
> and handling".  I suspect that is the reason these calculations are done
> outside the lock.
>
> As you mention above, nr_huge_pages_node[nid] could be greater than
> nr_huge_pages.  This is true even if we do READ_ONCE().  So, the code would
> need to test for this condition and 'fix up' values or re-read.  It is just
> racy without holding the lock.
>
> If that is too ugly, then we could just add code for the node specific
> adjustments.  set_max_huge_pages() is only called from here.  It would be
> pretty easy to modify set_max_huge_pages() to take the node specific value
> and do calculations/adjustments under the lock.

Ok, what about just moving the calculation/check inside the lock as in the
untested patch below?

Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
---
 mm/hugetlb.c | 34 ++++++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1c5219193b9e..5afa77dc7bc8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2274,7 +2274,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
 }
 
 #define persistent_huge_pages(h) (h->nr_huge_pages - h->surplus_huge_pages)
-static int set_max_huge_pages(struct hstate *h, unsigned long count,
+static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 			      nodemask_t *nodes_allowed)
 {
 	unsigned long min_count, ret;
@@ -2289,6 +2289,23 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count,
 		goto decrease_pool;
 	}
 
+	spin_lock(&hugetlb_lock);
+
+	/*
+	 * Check for a node specific request.  Adjust global count, but
+	 * restrict alloc/free to the specified node.
+	 */
+	if (nid != NUMA_NO_NODE) {
+		unsigned long old_count = count;
+		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
+		/*
+		 * If user specified count causes overflow, set to
+		 * largest possible value.
+		 */
+		if (count < old_count)
+			count = ULONG_MAX;
+	}
+
 	/*
 	 * Increase the pool size
 	 * First take pages out of surplus state.  Then make up the
@@ -2300,7 +2317,6 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count,
 	 * pool might be one hugepage larger than it needs to be, but
 	 * within all the constraints specified by the sysctls.
 	 */
-	spin_lock(&hugetlb_lock);
 	while (h->surplus_huge_pages && count > persistent_huge_pages(h)) {
 		if (!adjust_pool_surplus(h, nodes_allowed, -1))
 			break;
@@ -2421,16 +2437,18 @@ static ssize_t __nr_hugepages_store_common(bool obey_mempolicy,
 			nodes_allowed = &node_states[N_MEMORY];
 		}
 	} else if (nodes_allowed) {
+		/* Node specific request */
+		init_nodemask_of_node(nodes_allowed, nid);
+	} else {
 		/*
-		 * per node hstate attribute: adjust count to global,
-		 * but restrict alloc/free to the specified node.
+		 * Node specific request, but we could not allocate
+		 * node mask.  Pass in ALL nodes, and clear nid.
 		 */
-		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
-		init_nodemask_of_node(nodes_allowed, nid);
-	} else
+		nid = NUMA_NO_NODE;
 		nodes_allowed = &node_states[N_MEMORY];
+	}
 
-	err = set_max_huge_pages(h, count, nodes_allowed);
+	err = set_max_huge_pages(h, count, nid, nodes_allowed);
 	if (err)
 		goto out;
-- 
2.17.2
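[Editor's note: for readers unfamiliar with the clamp-on-overflow pattern used
in both hunks above, here is a minimal standalone userspace sketch, not part of
the patch; the helper name add_clamp is made up for illustration.  Unsigned
addition in C wraps modulo 2^N, so an overflowed sum is always smaller than
either operand, which is what the "count < old_count" test detects.

	#include <limits.h>
	#include <stdio.h>

	/* Hypothetical helper mirroring the patch's overflow check. */
	static unsigned long add_clamp(unsigned long count, unsigned long delta)
	{
		unsigned long old_count = count;

		count += delta;
		/* An overflowed unsigned sum wraps below the old value. */
		if (count < old_count)
			count = ULONG_MAX;
		return count;
	}

	int main(void)
	{
		printf("%lu\n", add_clamp(10UL, 20UL));           /* 30 */
		printf("%lu\n", add_clamp(ULONG_MAX - 1UL, 5UL)); /* clamped to ULONG_MAX */
		return 0;
	}

Note this check is only meaningful because the patch computes the delta while
holding hugetlb_lock: with the lock held, h->nr_huge_pages cannot be less than
h->nr_huge_pages_node[nid], so the subtraction cannot underflow and the only
remaining concern is the addition wrapping, which is the underflow race David
pointed out in the unlocked version.]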