Re: [PATCH 0/2] fix for "pathological THP behavior"

Andrea Arcangeli <aarcange@xxxxxxxxxx> · Mon, 20 Aug 2018 19:24:59 -0400

Hello,

On Mon, Aug 20, 2018 at 12:06:11PM -0700, Yang Shi wrote:
> May the approach #1 break the setting of zone_reclaim_mode? Or it may
> behave like zone_reclaim_mode is set even though the knob is cleared?

Yes, the current default MADV_HUGEPAGE THP behavior is similar to
zone/node_reclaim_mode, and approach #1 won't change that.

The problem is that it behaved like the hardest kind of
zone/node_reclaim_mode. It wouldn't even try to stop unmap/writeback.
zone/node_reclaim_mode can stop that at least.
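
For reference, here is a minimal userland sketch of how the knob's
bits separate "reclaim locally" from "allow writeback" and "allow
unmap". The bit values follow the documented meaning of the
vm.zone_reclaim_mode sysctl (1 = reclaim on, 2 = write dirty pages
out, 4 = swap/unmap pages). The toy_ names and the struct are made up
for illustration and are not kernel identifiers.

```c
#include <stdbool.h>
#include <stdio.h>

/* Bit meanings as documented for the vm.zone_reclaim_mode sysctl.
 * The TOY_ names are local to this sketch, not kernel symbols. */
#define TOY_RECLAIM_ZONE  1 /* node-local reclaim enabled */
#define TOY_RECLAIM_WRITE 2 /* reclaim may write out dirty pages */
#define TOY_RECLAIM_UNMAP 4 /* reclaim may unmap/swap mapped pages */

struct toy_reclaim_mode {
	bool local_reclaim;
	bool may_writeback;
	bool may_unmap;
};

/* Split the integer knob value into its three documented bits. */
struct toy_reclaim_mode toy_decode_reclaim_mode(int v)
{
	struct toy_reclaim_mode m;

	m.local_reclaim = (v & TOY_RECLAIM_ZONE) != 0;
	m.may_writeback = (v & TOY_RECLAIM_WRITE) != 0;
	m.may_unmap = (v & TOY_RECLAIM_UNMAP) != 0;
	return m;
}

/* Read the current knob value; returns -1 if it cannot be read. */
int toy_read_reclaim_mode(void)
{
	int v = -1;
	FILE *f = fopen("/proc/sys/vm/zone_reclaim_mode", "r");

	if (!f)
		return -1;
	if (fscanf(f, "%d", &v) != 1)
		v = -1;
	fclose(f);
	return v;
}
```

With the value 1, local reclaim is on but writeback and unmap stay
off, which is the "can stop that at least" case above.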

Approach #1 simply reduces the aggressiveness level from the
hardest kind of zone/node_reclaim_mode to something lighter than any
reclaim would be (i.e. no reclaim and only compaction, which of course
only makes sense for order > 0 allocations).

If THP fails then the PAGE_SIZE allocation fallback kicks in and it'll
spread to all nodes and it will invoke reclaim if needed. If it
invokes reclaim, it'll behave according to node_reclaim_mode if
set. There's no change to that part.
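
To make the ordering above concrete, here is a toy model of the fault
path under approach #1. It is not kernel code: every name in it is
hypothetical, the per-node state is reduced to four booleans, and it
assumes node_reclaim_mode is cleared (with the knob set, the
PAGE_SIZE fallback would reclaim locally before going remote).

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_result {
	TOY_THP_LOCAL,        /* huge page obtained on the local node */
	TOY_SMALL_NO_RECLAIM, /* PAGE_SIZE fallback served from free memory */
	TOY_SMALL_RECLAIM,    /* PAGE_SIZE fallback had to reclaim */
	TOY_FAIL
};

struct toy_node {
	bool has_free_thp;        /* a free huge-page-sized block exists */
	bool compaction_can_help; /* fragments can be compacted into one */
	bool has_free_small;      /* at least one free PAGE_SIZE page */
	bool reclaimable;         /* reclaim could free a PAGE_SIZE page */
};

enum toy_result toy_fault(const struct toy_node *nodes, int nr_nodes,
			  int local)
{
	const struct toy_node *n = &nodes[local];
	int i;

	/* Step 1: THP on the local node, compaction only, never reclaim. */
	if (n->has_free_thp || n->compaction_can_help)
		return TOY_THP_LOCAL;

	/* Step 2: PAGE_SIZE fallback over all nodes, local node first. */
	for (i = 0; i < nr_nodes; i++) {
		if (nodes[(local + i) % nr_nodes].has_free_small)
			return TOY_SMALL_NO_RECLAIM;
	}

	/* Step 3: only the PAGE_SIZE fallback may invoke reclaim. */
	for (i = 0; i < nr_nodes; i++) {
		if (nodes[(local + i) % nr_nodes].reclaimable)
			return TOY_SMALL_RECLAIM;
	}

	return TOY_FAIL;
}
```

The point of the model is that reclaim only ever shows up in step 3,
after the THP attempt has already given up.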

When MADV_HUGEPAGE wasn't used or defrag wasn't set to "always", the
current code didn't even invoke compaction, but the whole point of
MADV_HUGEPAGE is to try to provide THP from the very first page fault,
so it's ok to pay the cost of compaction there because userland told
us those are long lived performance sensitive allocations.
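
For completeness, this is roughly what the userland side of that hint
looks like. It is only a sketch: alloc_thp_hinted() is a hypothetical
helper name, and the madvise() result is deliberately ignored because
the call is advisory and may fail on kernels built without THP.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map an anonymous region and mark it as a
 * candidate for transparent huge pages. Returns NULL on failure. */
void *alloc_thp_hinted(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;
#ifdef MADV_HUGEPAGE
	/* Advisory only: the mapping stays usable if this fails. */
	(void)madvise(p, len, MADV_HUGEPAGE);
#endif
	return p;
}
```

The hint has to be given before the region is first touched for the
"THP from the very first page fault" behavior to apply.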

What MADV_HUGEPAGE can't do is trigger a heavy swapout of the
memory in the local node when there may be plenty of free memory
in all other nodes (even THP pages) and in the local node in PAGE_SIZE
fragments.

Thanks,
Andrea