On Wed, Jul 16, 2014 at 03:48:15PM +0200, Vlastimil Babka wrote:
> Async compaction aborts when it detects zone lock contention or need_resched()
> is true. David Rientjes has reported that in practice, most direct async
> compactions for THP allocation abort due to need_resched(). This means that a
> second direct compaction is never attempted, which might be OK for a page
> fault, but khugepaged is intended to attempt a sync compaction in such case and
> in these cases it won't.
>
> This patch replaces "bool contended" in compact_control with an int that
> distinguishes between aborting due to need_resched() and aborting due to lock
> contention. This allows propagating the abort through all compaction functions
> as before, but passing the abort reason up to __alloc_pages_slowpath() which
> decides when to continue with direct reclaim and another compaction attempt.
>
> Another problem is that try_to_compact_pages() did not act upon the reported
> contention (either need_resched() or lock contention) immediately and would
> proceed with another zone from the zonelist. When need_resched() is true, that
> means initializing another zone compaction, only to check need_resched() again
> in isolate_migratepages() and abort. For zone lock contention, the
> unintended consequence is that the lock contended status reported back to the
> allocator is determined from the last zone where compaction was attempted, which
> is rather arbitrary.
>
> This patch fixes the problem in the following way:
> - async compaction of a zone aborting due to need_resched() or fatal signal
>   pending means that further zones should not be tried. We report
>   COMPACT_CONTENDED_SCHED to the allocator.
> - aborting zone compaction due to lock contention means we can still try
>   another zone, since it has a different set of locks. We report back
>   COMPACT_CONTENDED_LOCK only if, in *all* zones where compaction was
>   attempted, it was aborted due to lock contention.
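
The two rules in the list above can be modeled in user space roughly as
follows. This is only a sketch under stated assumptions, not the code from the
patch: aggregate_contention() and the zone_result array are hypothetical, and
COMPACT_CONTENDED_NONE is an assumed name for "no abort". Only
COMPACT_CONTENDED_SCHED and COMPACT_CONTENDED_LOCK are taken from the
description. The model also ignores the early exit a real allocator would take
once a zone compacts successfully.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Abort reasons. SCHED and LOCK are named in the patch description;
 * NONE is an assumption of this sketch meaning "compaction did not abort".
 */
enum compact_contended {
	COMPACT_CONTENDED_NONE = 0,
	COMPACT_CONTENDED_SCHED,	/* need_resched() or fatal signal pending */
	COMPACT_CONTENDED_LOCK,		/* zone lock contention */
};

/*
 * Hypothetical model of the zonelist walk: zone_result[i] is the abort
 * reason that async compaction of zone i reports. Returns the status that
 * would be handed back to the allocator and stores the number of zones
 * actually attempted in *tried.
 */
static enum compact_contended
aggregate_contention(const enum compact_contended *zone_result,
		     size_t nr_zones, size_t *tried)
{
	int all_lock_contended = 1;
	size_t i;

	assert(tried != NULL);
	assert(zone_result != NULL || nr_zones == 0);

	*tried = 0;
	for (i = 0; i < nr_zones; i++) {
		(*tried)++;

		/* Rule 1: a SCHED abort stops the walk; no further zones. */
		if (zone_result[i] == COMPACT_CONTENDED_SCHED)
			return COMPACT_CONTENDED_SCHED;

		/* Rule 2: lock contention is per zone, so keep going. */
		if (zone_result[i] != COMPACT_CONTENDED_LOCK)
			all_lock_contended = 0;
	}

	/* Report LOCK only if every attempted zone aborted on lock contention. */
	if (*tried > 0 && all_lock_contended)
		return COMPACT_CONTENDED_LOCK;

	return COMPACT_CONTENDED_NONE;
}
```

With zone results { LOCK, SCHED, NONE } the walk stops at the second zone and
reports COMPACT_CONTENDED_SCHED; with { LOCK, NONE } it reports
COMPACT_CONTENDED_NONE, because the last zone's status no longer decides the
outcome on its own.
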
>
> As a result of these fixes, khugepaged will proceed with second sync compaction
> as intended, when the preceding async compaction aborted due to need_resched().
> Page fault compactions aborting due to need_resched() will spare some cycles
> previously wasted by initializing another zone compaction only to abort again.
> Lock contention will be reported only when compaction in all zones aborted due
> to lock contention, and therefore it's not a good idea to try again after
> reclaim.
>
> Reported-by: David Rientjes <rientjes@xxxxxxxxxx>
> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> Cc: Minchan Kim <minchan@xxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> Cc: Michal Nazarewicz <mina86@xxxxxxxxxx>
> Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> Cc: Christoph Lameter <cl@xxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxx>

Acked-by: Mel Gorman <mgorman@xxxxxxx>

-- 
Mel Gorman
SUSE Labs