Hello,

On Mon, Aug 20, 2018 at 12:06:11PM -0700, Yang Shi wrote:
> May the approach #1 break the setting of zone_reclaim_mode? Or it may
> behave like zone_reclaim_mode is set even though the knob is cleared?

The current MADV_HUGEPAGE THP default behavior is similar to
zone/node_reclaim_mode, yes, and approach #1 won't change that.

The problem is that it behaved like the hardest kind of
zone/node_reclaim_mode. It wouldn't even try to stop
unmap/writeback. zone/node_reclaim_mode can stop that at least.

Approach #1 simply reduces the aggressiveness level from the hardest
kind of zone/node_reclaim_mode to something lighter than any reclaim
would be (i.e. no reclaim and only compaction, which of course only
makes sense for order > 0 allocations).

If THP fails then the PAGE_SIZE allocation fallback kicks in and it'll
spread to all nodes and it will invoke reclaim if needed. If it
invokes reclaim, it'll behave according to node_reclaim_mode if
set. There's no change to that part.

When MADV_HUGEPAGE wasn't used or defrag wasn't set to "always", the
current code didn't even invoke compaction, but the whole point of
MADV_HUGEPAGE is to try to provide THP from the very first page fault,
so it's ok to pay the cost of compaction there because userland told
us those are long lived, performance sensitive allocations.

What MADV_HUGEPAGE can't do is trigger a heavy swapout of the memory
in the local node when there may be plenty of free memory in all the
other nodes (even THP pages) and in the local node in PAGE_SIZE
fragments.

Thanks,
Andrea
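
PS. For reference, a minimal userland sketch of the hint discussed
above. It assumes a 2 MiB THP size (typical on x86-64, not guaranteed
elsewhere). The helper names thp_round_up() and map_with_thp_hint()
are hypothetical and only for illustration; mmap(), madvise() and
MADV_HUGEPAGE are the real interfaces.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumption: 2 MiB huge pages. Other architectures may differ. */
#define THP_SIZE_ASSUMED ((size_t)2 << 20)

/* Hypothetical helper: round len up to a multiple of the assumed THP size. */
static size_t thp_round_up(size_t len)
{
    return (len + THP_SIZE_ASSUMED - 1) & ~(THP_SIZE_ASSUMED - 1);
}

/*
 * Hypothetical helper: map an anonymous private region and tell the
 * kernel it is a long lived, performance sensitive allocation.
 *
 * Returns NULL if mmap() fails. *hinted is set to 1 if
 * madvise(MADV_HUGEPAGE) succeeded and to 0 otherwise (for example
 * when the kernel was built without THP support). The hint is only
 * advice: whether a huge page is really used at fault time depends on
 * alignment, the sysfs THP settings and memory availability.
 */
static void *map_with_thp_hint(size_t len, int *hinted)
{
    assert(len > 0 && hinted != NULL);

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    *hinted = (madvise(p, len, MADV_HUGEPAGE) == 0);
    return p;
}
```

The caller would touch the memory after the hint, so that the very
first page fault already happens with MADV_HUGEPAGE in effect, and
release the region with munmap() using the same rounded length.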