On 8/3/19 12:39 AM, Mike Kravetz wrote:
> When allocating hugetlbfs pool pages via /proc/sys/vm/nr_hugepages,
> the pages will be interleaved between all nodes of the system. If
> nodes are not equal, it is quite possible for one node to fill up
> before the others. When this happens, the code still attempts to
> allocate pages from the full node. This results in calls to direct
> reclaim and compaction which slow things down considerably.
>
> When allocating pool pages, note the state of the previous allocation
> for each node. If previous allocation failed, do not use the
> aggressive retry algorithm on successive attempts. The allocation
> will still succeed if there is memory available, but it will not try
> as hard to free up memory.
>
> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>

It looks like only part of the agreed-upon suggestions were implemented:

- set_max_huge_pages() returns -ENOMEM if the nodemask can't be
  allocated, but hugetlb_hstate_alloc_pages() doesn't.
- there's still __GFP_NORETRY in the nodemask allocations
- (cosmetic) Mel pointed out that NODEMASK_FREE() works fine with
  NULL pointers, so the NULL check before freeing is unnecessary

Thanks,
Vlastimil