On Mon 29-10-18 20:42:53, Balbir Singh wrote:
> On Mon, Oct 29, 2018 at 10:00:35AM +0100, Michal Hocko wrote:
[...]
> > These hugetlb allocations might be disruptive and that is an expected
> > behavior because this is an explicit requirement from an admin to
> > pre-allocate large pages for the future use. __GFP_RETRY_MAYFAIL just
> > underlines that requirement.
>
> Yes, but in the absence of a particular node, for example via sysctl
> (as the compaction does), I don't think it is a hard requirement to get
> a page from a particular node.

Again this seems like a deliberate decision. You want your distributions
as even as possible otherwise the NUMA placement will be much less
deterministic. At least that was the case for a long time. If you have
different per-node preferences, just use NUMA aware pre-allocation.

> I agree we need __GFP_RETRY_FAIL, in any
> case the real root cause for me is should_reclaim_continue() which keeps
> the task looping without making forward progress.

This seems like a separate issue which should better be debugged. Please
open a new thread describing the problem and the state of the node.
-- 
Michal Hocko
SUSE Labs