On 06/14/2017 03:12 PM, Mike Kravetz wrote:
> On 06/13/2017 02:00 AM, Michal Hocko wrote:
>> From: Michal Hocko <mhocko@xxxxxxxx>
>>
>> alloc_huge_page_nodemask tries to allocate from any numa node in the
>> allowed node mask starting from lower numa nodes. This might lead to
>> filling up those low NUMA nodes while others are not used. We can reduce
>> this risk by introducing a concept of the preferred node similar to what
>> we have in the regular page allocator. We will start allocating from the
>> preferred nid and then iterate over all allowed nodes in the zonelist
>> order until we try them all.
>>
>> This is mimicking the page allocator logic except it operates on
>> per-node mempools. dequeue_huge_page_vma already does this so distill
>> the zonelist logic into a more generic dequeue_huge_page_nodemask
>> and use it in alloc_huge_page_nodemask.
>>
>> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
>> ---
>
> I built attempts/hugetlb-zonelists, threw it on a test machine, ran the
> libhugetlbfs test suite and saw failures. The failures started with this
> patch: commit 7e8b09f14495 in your tree. I have not yet started to look
> into the failures. It is even possible that the tests are making bad
> assumptions, but there certainly appear to be changes in behavior visible
> to the application(s).

nm. The failures were the result of dequeue_huge_page_nodemask() always
returning NULL. Vlastimil already noticed this issue and provided a
solution.

-- 
Mike Kravetz

> FYI - My 'test machine' is an x86 KVM instance with 8GB memory simulating
> 2 nodes. Huge page allocations before running tests:
> node0
>   512 free_hugepages
>   512 nr_hugepages
>   0 surplus_hugepages
> node1
>   512 free_hugepages
>   512 nr_hugepages
>   0 surplus_hugepages
>
> I can take a closer look at the failures tomorrow.
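
For anyone skimming the thread, the zonelist walk described in Michal's
quoted changelog would look roughly like the sketch below. This is only an
illustration of the approach, not the code from the patch itself: the
dequeue_huge_page_node_exact() helper name is assumed here as the per-node
pool dequeue, and the cpuset retry and hugetlb locking details are omitted.

#include <linux/gfp.h>
#include <linux/mmzone.h>
#include <linux/hugetlb.h>
#include <linux/numa.h>

/*
 * Sketch of the dequeue path: build the zonelist for the preferred nid
 * and try each allowed node's hugepage pool in zonelist order.
 * dequeue_huge_page_node_exact() stands in for the per-node pool helper.
 */
static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask,
					       int nid, nodemask_t *nmask)
{
	struct zonelist *zonelist = node_zonelist(nid, gfp_mask);
	struct zoneref *z;
	struct zone *zone;
	int node = NUMA_NO_NODE;

	for_each_zone_zonelist_nodemask(zone, z, zonelist,
					gfp_zone(gfp_mask), nmask) {
		struct page *page;

		/* The pool is per-node, not per-zone, so ask each node only once. */
		if (zone_to_nid(zone) == node)
			continue;
		node = zone_to_nid(zone);

		page = dequeue_huge_page_node_exact(h, node);
		if (page)
			return page;
	}

	return NULL;
}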