On 03/02/2016 01:24 PM, Michal Hocko wrote:
On Tue 01-03-16 19:14:08, Vlastimil Babka wrote:
I was under the impression that similar checks to compaction_suitable() were
done also in compact_finished(), to stop compacting if memory got low due to
parallel activity. But I guess it was a patch from Joonsoo that didn't get
merged.
My only other theory so far is that watermark checks fail in
__isolate_free_page() when we want to grab page(s) as migration targets.
Yes, this certainly contributes to the problem, and it triggered a lot in
my case:
$ grep __isolate_free_page trace.log | wc -l
181
$ grep __alloc_pages_direct_compact: trace.log | wc -l
7
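The two counts above can be gathered in one pass with a small helper. This is only a sketch: count_event and summarize_events are hypothetical names, and it assumes the trace was saved as plain text with one event per line, as in trace.log above.

```shell
# Hypothetical helpers; not part of any kernel tooling.

# count_event PATTERN FILE
# Print the number of lines in FILE that contain the fixed string PATTERN.
count_event() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c -F -- "$1" "$2" || true
}

# summarize_events FILE PATTERN...
# Print "PATTERN COUNT" for every pattern given.
summarize_events() {
    file=$1
    shift
    for pattern in "$@"; do
        printf '%s %s\n' "$pattern" "$(count_event "$pattern" "$file")"
    done
}

# Example (assumes trace.log exists in the current directory):
#   summarize_events trace.log __isolate_free_page __alloc_pages_direct_compact:
```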
I would suggest enabling all compaction tracepoints and the migration
tracepoint. Looking at the trace will hopefully get us there faster than
adding one trace_printk() per attempt.
OK, here we go with both watermark checks removed and hopefully all the
compaction-related tracepoints enabled:
echo 1 > /debug/tracing/events/compaction/enable
echo 1 > /debug/tracing/events/migrate/mm_migrate_pages/enable
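For anyone repeating this on a different machine, the same two writes can be wrapped so that a missing or read-only event file is reported instead of silently skipped. A minimal sketch: enable_compaction_tracing is a hypothetical name, and the default path /sys/kernel/debug/tracing is an assumption about where debugfs is mounted (the commands above use /debug/tracing).

```shell
# Hypothetical helper; the tracing directory is passed in because the
# debugfs mount point differs between systems.

# enable_compaction_tracing [TRACING_DIR]
# Enable all compaction events and the mm_migrate_pages event.
enable_compaction_tracing() {
    tracing_dir=${1:-/sys/kernel/debug/tracing}
    for ev in events/compaction/enable \
              events/migrate/mm_migrate_pages/enable; do
        if [ -w "$tracing_dir/$ev" ]; then
            echo 1 > "$tracing_dir/$ev"
        else
            echo "cannot write $tracing_dir/$ev" >&2
            return 1
        fi
    done
}

# Example, matching the mount point used above (needs root):
#   enable_compaction_tracing /debug/tracing
```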
The trace shows only 4 direct compaction attempts with order=2. The rest
are order=9, i.e. THP, which has little chance of success under such
pressure, hence the failures and deferrals. The few order=2 attempts
all appear to be successful (defer_reset is called).
So it seems your system is mostly fine with just reclaim, and there's
little need for order-2 compaction, and that's also why you can't
reproduce the OOMs. So I'm afraid we'll learn nothing here, and it looks
like Hugh will have to try those watermark check adjustments/removals
and/or provide the same kind of trace.
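For comparing traces, the per-order breakdown of attempts can be pulled out mechanically. This is a rough sketch under assumptions: orders_histogram is a hypothetical name, the event name in the example (mm_compaction_try_to_compact_pages) should be checked against the events directory on the running kernel, and the matching lines are assumed to carry an order=N field.

```shell
# Hypothetical helper; not part of any kernel tooling.

# orders_histogram EVENT FILE
# For lines in FILE containing the fixed string EVENT, print
# "order=N COUNT" for each distinct order=N field found.
orders_histogram() {
    grep -F -- "$1" "$2" \
        | grep -o 'order=[0-9][0-9]*' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Example (event name is an assumption, see above):
#   orders_histogram mm_compaction_try_to_compact_pages trace.log
```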
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx. For more info on Linux MM,
see: http://www.linux-mm.org/ .