Hi,

this has been brought up by Andrea [1], who proposed two different fixes for the regression. I proposed an alternative fix [2], but I have changed my mind: whatever fix we end up with should be backported to the stable trees, so a minimal one is preferable. I have therefore gone back to Andrea's second proposed solution [3]. I have only reworded the changelog to reflect another bug report with the stall information.

My primary concern about [3] was that the __GFP_THISNODE logic should be placed in alloc_hugepage_direct_gfpmask. I have done that on top of the fix as a cleanup (patch 2), which doesn't need to be backported to the stable trees.

I am still not happy that David's workload will regress as a result, but we should really focus on the default behavior and come up with a more robust solution for the specialized behavior wanted by those who have more restrictive NUMA preferences. I am thinking about a new NUMA policy that would mimic node reclaim behavior, and I am willing to work on that, but we really have to fix the regression first, and that is patch 1.

Thoughts, alternative patches?

[1] http://lkml.kernel.org/r/20180820032204.9591-1-aarcange@xxxxxxxxxx
[2] http://lkml.kernel.org/r/20180830064732.GA2656@xxxxxxxxxxxxxx
[3] http://lkml.kernel.org/r/20180820032640.9896-2-aarcange@xxxxxxxxxx