On 08/30/2018 12:54 AM, Zi Yan wrote:
>
> Thanks for your patch.
>
> I tested it against Linus’s tree with “memhog -r3 130g” in a two-socket machine with 128GB
> memory on each node and got the results below. I expect this test should fill one node, then
> fall back to the other.
>
> 1. madvise(MADV_HUGEPAGE) + defrag = {always, madvise, defer+madvise}: no swap, THPs are
> allocated in the fallback node.
> 2. madvise(MADV_HUGEPAGE) + defrag = defer: pages got swapped to the disk instead of being
> allocated in the fallback node.

Hmm, this is GFP_TRANSHUGE_LIGHT | __GFP_KSWAPD_RECLAIM | __GFP_THISNODE. There is no direct
reclaim, so it would have to be kswapd causing the swapping? I wouldn't expect it to be
significant enough to over-reclaim. What exactly is your definition of "pages got swapped"?

> 3. no madvise, THP is on by default + defrag = {always, defer, defer+madvise}: pages got
> swapped to the disk instead of being allocated in the fallback node.

So this should be the most common case (no madvise, THP on). If it's causing too much reclaim,
that's not good IMHO. Depending on defrag:

defer (the default) = same as above, so it would have to be kswapd
always = GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM | __GFP_NORETRY | __GFP_THISNODE - so does
direct reclaim also over-reclaim despite __GFP_NORETRY?
defer+madvise = same as defer

Vlastimil
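
[For reference, a minimal standalone sketch of the two gfp mask compositions cited above. This
is not the kernel's mask-selection code; the flag bit values below are placeholders, and the
program only illustrates which reclaim paths each composition permits.]

/*
 * Minimal standalone sketch (NOT kernel code): the two gfp mask
 * compositions cited above, and which reclaim paths each one permits.
 * Flag bit values are placeholders; only the composition matters.
 */
#include <stdio.h>

#define GFP_TRANSHUGE_LIGHT	(1u << 0)	/* placeholder bit */
#define __GFP_KSWAPD_RECLAIM	(1u << 1)	/* may wake kswapd */
#define __GFP_DIRECT_RECLAIM	(1u << 2)	/* may enter direct reclaim */
#define __GFP_NORETRY		(1u << 3)	/* give up after first attempt */
#define __GFP_THISNODE		(1u << 4)	/* no fallback to other nodes */

static void describe(const char *what, unsigned int gfp)
{
	printf("%s:\n", what);
	printf("  kswapd reclaim: %s\n", gfp & __GFP_KSWAPD_RECLAIM ? "yes" : "no");
	printf("  direct reclaim: %s\n", gfp & __GFP_DIRECT_RECLAIM ? "yes" : "no");
	printf("  noretry:        %s\n", gfp & __GFP_NORETRY ? "yes" : "no");
	printf("  thisnode:       %s\n", gfp & __GFP_THISNODE ? "yes" : "no");
}

int main(void)
{
	/* case 2: madvise(MADV_HUGEPAGE) + defrag=defer - only kswapd may reclaim */
	describe("defer",
		 GFP_TRANSHUGE_LIGHT | __GFP_KSWAPD_RECLAIM | __GFP_THISNODE);

	/* case 3 with defrag=always (no madvise) - direct reclaim allowed, but NORETRY */
	describe("always, no madvise",
		 GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM | __GFP_NORETRY |
		 __GFP_THISNODE);
	return 0;
}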