Re: [LSF/MM TOPIC] wmark based pro-active compaction

Michal Hocko <mhocko@xxxxxxxx> · Thu, 5 Jan 2017 11:27:22 +0100

On Thu 05-01-17 10:53:59, Vlastimil Babka wrote:
> [CC Joonsoo and Johannes]
> 
> On 12/30/2016 03:06 PM, Mel Gorman wrote:
> > On Fri, Dec 30, 2016 at 02:14:12PM +0100, Michal Hocko wrote:
> >> Hi,
> >> I didn't originally want to send this proposal because Vlastimil is
> >> planning to do some work in this area so I've expected him to send
> >> something similar. But the recent discussion about the THP defrag
> >> options pushed me to send out my thoughts.
> 
> No problem.
> 
> >> So what is the problem? The demand for high order pages is growing and
> >> that seems to be the general trend. The problem is that while they can
> >> bring performance benefit they can get be really expensive to allocate
> >> especially when we enter the direct compaction. So we really want to
> >> prevent from expensive path and defer as much as possible to the
> >> background. A huge step forward was kcompactd introduced by Vlastimil.
> >> We are still not there yet though, because it might be already quite
> >> late when we wakeup_kcompactd(). The memory might be already fragmented
> >> when we hit there.
> 
> Right.
> 
> >> Moreover we do not have any way to actually tell
> >> which orders we do care about.
> 
> Who is "we" here? The system admin?

yes

> >> Therefore I believe we need a watermark based pro-active compaction
> >> which would keep the background compaction busy as long as we have
> >> less pages of the configured order.
> 
> Again, configured by what, admin? I would rather try to avoid tunables
> here, if possible. While THP is quite well known example with stable
> order, the pressure for other orders is rather implementation specific
> (drivers, SLAB/SLUB) and may change with kernel versions (e.g. virtually
> mapped stacks, although that example is about non-costly order). Would
> the admin be expected to study the implementation to know which orders
> are needed, or react to page allocation failure reports? Neither sounds
> nice.

That is a good question but I expect that there are more users than THP
which use stable orders. E.g. networking stack tends to depend on the
packet size. A tracepoint with some histogram output would tell us what
is the requested orders distribution.

> >> kcompactd should wake up
> >> periodically, I think, and check for the status so that we can catch
> >> the fragmentation before we get low on memory.
> >> The interface could look something like:
> >> /proc/sys/vm/compact_wmark
> >> time_period order count
> 
> IMHO it would be better if the system could auto-tune this, e.g. by
> counting high-order alloc failures/needs for direct compaction per order
> between wakeups, and trying to bring them to zero.

auto-tunning is usually preferable I am just wondering how the admin can
tell what is still the system load price he is willing to pay. I suspect
we will see growing number of opportunistic high order requests over
time and  auto tunning shouldn't try to accomodate with it without
any bounds. There is still some cost/benefit to be evaluated from the
system level point of view which I am afraid is hard to achive from the
kcompactd POV.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>