On Tue, Jan 10, 2017 at 04:15:27PM -0800, David Rientjes wrote:
> There is no thp defrag option that currently allows MADV_HUGEPAGE regions
> to do direct compaction and reclaim while all other thp allocations simply
> trigger kswapd and kcompactd in the background and fail immediately.
>
> The "defer" setting simply triggers background reclaim and compaction for
> all regions, regardless of MADV_HUGEPAGE, which makes it unusable for our
> userspace where MADV_HUGEPAGE is being used to indicate the application is
> willing to wait for work for thp memory to be available.
>
> The "madvise" setting will do direct compaction and reclaim for these
> MADV_HUGEPAGE regions, but does not trigger kswapd and kcompactd in the
> background for anybody else.
>
> For reasonable usage, there needs to be a mesh between the two options.
> This patch introduces a fifth mode, "defer+madvise", that will do direct
> reclaim and compaction for MADV_HUGEPAGE regions and trigger background
> reclaim and compaction for everybody else so that hugepages may be
> available in the near future.
>
> A proposal to allow direct reclaim and compaction for MADV_HUGEPAGE
> regions as part of the "defer" mode, making it a very powerful setting and
> avoids breaking userspace, was offered:
> http://marc.info/?t=148236612700003.  This additional mode is a
> compromise.
>
> A second proposal to allow both "defer" and "madvise" to be selected at
> the same time was also offered: http://marc.info/?t=148357345300001.
> This is possible, but there was a concern that it might break existing
> userspaces that parse the output of the defrag mode, so the fifth option
> was introduced instead.
>
> This patch also cleans up the helper function for storing to "enabled"
> and "defrag" since the former supports three modes while the latter
> supports five and triple_flag_store() was getting unnecessarily messy.
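
For anyone following along, the three modes described above can be summarised in a small model. This is only a sketch in Python, not kernel code: the names Action, DEFRAG_BEHAVIOR, thp_fault_action and active_mode are made up for illustration. "always" and "never" are left out because the description above does not cover them. The bracketed-token parsing assumes the usual sysfs convention of marking the selected mode with square brackets.

```python
from enum import Enum


class Action(Enum):
    """What a failed THP allocation does next (illustrative names only)."""
    DIRECT = "direct reclaim and compaction"
    BACKGROUND = "wake kswapd/kcompactd in the background and fail"
    NONE = "fail without triggering reclaim or compaction"


# Hypothetical lookup table built from the patch description:
# mode -> (action for MADV_HUGEPAGE regions, action for all other regions)
DEFRAG_BEHAVIOR = {
    "defer": (Action.BACKGROUND, Action.BACKGROUND),
    "madvise": (Action.DIRECT, Action.NONE),
    "defer+madvise": (Action.DIRECT, Action.BACKGROUND),
}


def thp_fault_action(mode, madv_hugepage):
    """Return the modelled action for a region under the given defrag mode."""
    for_madvised, for_others = DEFRAG_BEHAVIOR[mode]
    return for_madvised if madv_hugepage else for_others


def active_mode(sysfs_line):
    """Extract the selected mode from a line such as
    'always defer defer+madvise [madvise] never'."""
    for token in sysfs_line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no bracketed mode found")
```

A userspace parser written like active_mode() matches whole whitespace-separated tokens, so it would keep working when "defer+madvise" is added as a fifth token. A parser that expects exactly one selected entry would not cope with "defer" and "madvise" both being selected, which is the concern raised about the second proposal.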
>
> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
> ---
>  v2: uses new naming suggested by Vlastimil
>      (defer+madvise order looks better in
>       "... defer defer+madvise madvise ...")
>
>  v1 was acked by Mel, and it probably could have been preserved but it was
>  removed in case there is an issue with the name change.
>

There isn't

Acked-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>

Thanks.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .