On Wed 14-06-17 17:12:31, Mike Kravetz wrote:
> On 06/14/2017 03:12 PM, Mike Kravetz wrote:
> > On 06/13/2017 02:00 AM, Michal Hocko wrote:
> >> From: Michal Hocko <mhocko@xxxxxxxx>
> >>
> >> alloc_huge_page_nodemask tries to allocate from any numa node in the
> >> allowed node mask starting from lower numa nodes. This might lead to
> >> filling up those low NUMA nodes while others are not used. We can reduce
> >> this risk by introducing a concept of the preferred node similar to what
> >> we have in the regular page allocator. We will start allocating from the
> >> preferred nid and then iterate over all allowed nodes in the zonelist
> >> order until we try them all.
> >>
> >> This is mimicking the page allocator logic except it operates on
> >> per-node mempools. dequeue_huge_page_vma already does this so distill
> >> the zonelist logic into a more generic dequeue_huge_page_nodemask
> >> and use it in alloc_huge_page_nodemask.
> >>
> >> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> >> ---
> >
> > I built attempts/hugetlb-zonelists, threw it on a test machine, ran the
> > libhugetlbfs test suite and saw failures. The failures started with this
> > patch: commit 7e8b09f14495 in your tree. I have not yet started to look
> > into the failures. It is even possible that the tests are making bad
> > assumptions, but there certainly appears to be changes in behavior visible
> > to the application(s).
>
> nm. The failures were the result of dequeue_huge_page_nodemask() always
> returning NULL. Vlastimil already noticed this issue and provided a
> solution.

I have pushed my current version to the same branch.
--
Michal Hocko
SUSE Labs
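
For reference, a minimal sketch of the zonelist walk the quoted changelog
describes: start from the preferred nid's zonelist and try each allowed
node's per-node hugetlb pool at most once. This is only an illustration,
not the code from the patch itself; the per-node dequeue helper name
dequeue_huge_page_node_exact() and the omission of the cpuset retry logic
are assumptions of this sketch.

static struct page *dequeue_huge_page_nodemask(struct hstate *h,
		gfp_t gfp_mask, int nid, nodemask_t *nmask)
{
	struct zonelist *zonelist = node_zonelist(nid, gfp_mask);
	struct zoneref *z;
	struct zone *zone;
	int node = NUMA_NO_NODE;

	/* walk zones in zonelist order, starting from the preferred nid */
	for_each_zone_zonelist_nodemask(zone, z, zonelist,
					gfp_zone(gfp_mask), nmask) {
		struct page *page;

		/* the pool is per-node, so do not ask the same node twice */
		if (zone_to_nid(zone) == node)
			continue;
		node = zone_to_nid(zone);

		/* hypothetical per-node pool dequeue helper */
		page = dequeue_huge_page_node_exact(h, node);
		if (page)
			return page;
	}

	return NULL;
}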