Re: [PATCH 00 of 41] Transparent Hugepage Support #17

Andrea Arcangeli <aarcange@xxxxxxxxxx> · Mon, 12 Apr 2010 09:08:11 +0200

On Mon, Apr 12, 2010 at 04:09:31PM +1000, Nick Piggin wrote:
> One problem is that you need to keep a lot more memory free in order
> for it to be reasonably effective. Another thing is that the problem
> of fragmentation breakdown is not just a one-shot event that fills
> memory with pinned objects. It is a slow degradation.

set_recommended_min_free_kbytes doesn't seem to be a function of RAM
size, so the reserve stays around 60MB however big the box is. That
isn't such a big deal.
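
To make the arithmetic concrete, here is a rough sketch of how such a
reserve could be computed. The formula and constants are my
assumptions (modelled on how the function looks in later mainline
kernels, with x86-64 values: 4KB pages, 2MB pageblocks, 3 migrate
types), not something quoted from this patchset. The point is only
that the result scales with the number of zones, not with RAM size.

```python
# Sketch only: constants and formula are assumptions for x86-64,
# not copied from the patchset under discussion.
PAGE_SIZE_KB = 4            # 4KB base pages
PAGEBLOCK_NR_PAGES = 512    # one 2MB pageblock
MIGRATE_PCPTYPES = 3        # unmovable, reclaimable, movable


def recommended_min_free_kbytes(nr_zones, lowmem_pages=None):
    """Return a reserve in KB that depends on the zone count only."""
    # Two free pageblocks per zone for the reserve.
    pages = PAGEBLOCK_NR_PAGES * nr_zones * 2
    # Spare pageblocks so each migrate type can fall back to another
    # type without mixing page types inside one pageblock.
    pages += PAGEBLOCK_NR_PAGES * nr_zones * MIGRATE_PCPTYPES * MIGRATE_PCPTYPES
    # Optional cap (assumed): never reserve more than 5% of lowmem.
    if lowmem_pages is not None:
        pages = min(pages, lowmem_pages // 20)
    return pages * PAGE_SIZE_KB


if __name__ == "__main__":
    for ram_gb in (4, 64, 512):
        lowmem_pages = ram_gb * 1024 * 1024 // PAGE_SIZE_KB
        kb = recommended_min_free_kbytes(3, lowmem_pages)
        print(f"{ram_gb}GB RAM, 3 zones -> {kb}KB reserved")
```

With three populated zones this gives the same tens-of-MB figure for
a 4GB box and a 512GB box, which is the ballpark I mean above.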

> Especially when you use something like SLUB as the memory allocator
> which requires higher order allocations for objects which are pinned
> in kernel memory.
> 
> Just running a few minutes of testing with a kernel compile in the
> background does not show the full picture. You really need a box that
> has been up for days running a proper workload before you are likely
> to see any breakdown.
> 
> I'm sure it's horrible for planning if the RDBMS or VM boxes gradually
> get slower after X days of uptime. It's better to have consistent
> performance really, for anything except pure benchmark setups.

All the data I provided is very real. In addition to building a ton
of packages and running emerge on /usr/portage, I've been running all
my real loads. The only problem is that I only ran it for a day and a
half, but the load I kept it under was significant (surely a much
bigger inode/dentry load than any hypervisor usage would ever
generate).

> Defrag is not futile in theory, you just have to either have a reserve
> of movable pages (and never allow pinned kernel pages in there), or
> you need to allocate pinned kernel memory in units of the chunk size
> goal (which just gives you different types of fragmentation problems)
> or you need to do non-linear kernel mappings so you can defrag pinned
> kernel memory (with *lots* of other problems of course). So you just
> have a lot of downsides.

That's what the kernelcore= option does, no? Isn't that a good enough
mathematical guarantee? We should probably use it in hypervisor
products just in case, to be mathematically guaranteed never to need
VM migration as the fallback (but definitive) defrag algorithm.
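
As a back-of-the-envelope illustration of the kind of guarantee I
mean: kernelcore= caps the memory usable for non-movable kernel
allocations, and the rest ends up in ZONE_MOVABLE. The sketch below
is a simplification under my own assumptions (single node, no memory
holes, 2MB hugepages, hypothetical helper name), so treat the result
as an upper bound rather than what a real box would report.

```python
# Sketch only: single-node model, ignores holes, reserves and NUMA
# spreading of kernelcore. movable_split() is a hypothetical helper.
HUGEPAGE_KB = 2 * 1024   # 2MB hugepage (x86-64 assumption)
GB_IN_KB = 1024 * 1024


def movable_split(total_ram_kb, kernelcore_kb):
    """Return (movable_kb, hugepages) left over after kernelcore."""
    # kernelcore cannot exceed the RAM that actually exists.
    kernelcore_kb = min(kernelcore_kb, total_ram_kb)
    movable_kb = total_ram_kb - kernelcore_kb
    # Everything in the movable region can in principle be migrated
    # or reclaimed, so each aligned 2MB chunk could become a hugepage.
    return movable_kb, movable_kb // HUGEPAGE_KB


if __name__ == "__main__":
    movable_kb, hugepages = movable_split(16 * GB_IN_KB, 4 * GB_IN_KB)
    print(f"movable: {movable_kb}KB, up to {hugepages} hugepages")
```

So booting a 16GB hypervisor with kernelcore=4G would, in this
simplified model, leave 12GB that pinned kernel objects can never
fragment.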
