On Tue, 15 Nov 2011, Mel Gorman wrote:

> Adding sync here could obviously be implemented although it may
> require both always-sync and madvise-sync. Alternatively, something
> like an options file could be created to create a bitmap similar to
> what ftrace does. Whatever the mechanism, it exposes the fact that
> "sync compaction" is used. If that turns out to be not enough, then
> you may want to add other steps like aggressively reclaiming memory
> which also potentially may need to be controlled via the sysfs file
> and this is the slippery slope.
>

So what's being proposed here in this patch is the fifth time this line
has been changed, and it's always been switched between true and
!(gfp_mask & __GFP_NO_KSWAPD).  Instead of changing it every few months,
I'd suggest that we tie the semantics of the tunable directly to
sync_compaction since we're primarily targeting thp hugepages with this
change anyway for the "always" case.  Comments?

diff --git a/Documentation/vm/transhuge.txt b/Documentation/vm/transhuge.txt
--- a/Documentation/vm/transhuge.txt
+++ b/Documentation/vm/transhuge.txt
@@ -116,6 +116,13 @@ echo always >/sys/kernel/mm/transparent_hugepage/defrag
 echo madvise >/sys/kernel/mm/transparent_hugepage/defrag
 echo never >/sys/kernel/mm/transparent_hugepage/defrag
 
+If defrag is set to "always", then all hugepage allocations also attempt
+synchronous memory compaction which makes the allocation as aggressive
+as possible.  The overhead of attempting to allocate the hugepage is
+considered acceptable because of the longterm benefits of the hugepage
+itself at runtime.  If the VM should fallback to using regular pages
+instead, then you should use "madvise" or "never".
+
 khugepaged will be automatically started when
 transparent_hugepage/enabled is set to "always" or "madvise, and it'll
 be automatically shutdown if it's set to "never".
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2168,7 +2168,17 @@ rebalance:
 					sync_migration);
 	if (page)
 		goto got_pg;
-	sync_migration = true;
+
+	/*
+	 * Do not use synchronous migration for transparent hugepages unless
+	 * defragmentation is always attempted for such allocations since it
+	 * can stall in writeback, which is far worse than simply failing to
+	 * promote a page.  Otherwise, we really do want a hugepage and are as
+	 * aggressive as possible to allocate it.
+	 */
+	sync_migration = !(gfp_mask & __GFP_NO_KSWAPD) ||
+			 (transparent_hugepage_flags &
+			  (1 << TRANSPARENT_HUGEPAGE_DEFRAG_FLAG));
 
 	/* Try direct reclaim and then allocating */
 	page = __alloc_pages_direct_reclaim(gfp_mask, order,
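
For anyone who wants to check the resulting behavior without reading the allocator slowpath, the new condition can be sketched as a standalone userspace predicate. This is only an illustration of the boolean logic in the hunk above: the DEMO_* constants and the helper name use_sync_migration() are made up for this sketch, and the bit values are not the kernel's real definitions.

```c
#include <stdbool.h>

/*
 * Stand-ins for __GFP_NO_KSWAPD and TRANSPARENT_HUGEPAGE_DEFRAG_FLAG.
 * The values are assumptions chosen for this example only.
 */
#define DEMO_GFP_NO_KSWAPD	(1u << 0)
#define DEMO_THP_DEFRAG_FLAG	2	/* bit index, not a mask */

/*
 * Mirrors the expression assigned to sync_migration in the patch:
 * allocations that are not marked no-kswapd always get synchronous
 * migration; no-kswapd allocations (thp in this thread) get it only
 * when the defrag "always" bit is set.
 */
static bool use_sync_migration(unsigned int gfp_mask, unsigned long thp_flags)
{
	return !(gfp_mask & DEMO_GFP_NO_KSWAPD) ||
	       (thp_flags & (1ul << DEMO_THP_DEFRAG_FLAG));
}
```

With these stand-ins, a no-kswapd allocation with the defrag bit clear yields false, which corresponds to the "madvise" and "never" settings in the documentation hunk, assuming the "always" bit is only set when defrag is "always". Every other combination yields true.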