Re: [PATCH 00 of 41] Transparent Hugepage Support

On Tue, Apr 06, 2010 at 09:30:28AM +0100, Mel Gorman wrote:
> On Tue, Apr 06, 2010 at 12:18:24AM +0300, Avi Kivity wrote:
> > On 04/06/2010 12:01 AM, Chris Mason wrote:
> >> On Mon, Apr 05, 2010 at 01:32:21PM -0700, Linus Torvalds wrote:
> >>    
> >>>
> >>> On Mon, 5 Apr 2010, Pekka Enberg wrote:
> >>>      
> >>>> AFAIK, most modern GCs split memory in young and old generation
> >>>> "zones" and _copy_ surviving objects from the former to the latter if
> >>>> their lifetime exceeds some threshold. The JVM keeps scanning the
> >>>> smaller young generation very aggressively which causes TLB pressure
> >>>> and scans the larger old generation less often.
> >>>>        
> >>> .. my only input to this is: numbers talk, bullsh*t walks.
> >>>
> >>> I'm not interested in micro-benchmarks, either. I can show infinite TLB
> >>> walk improvement in a microbenchmark.
> >>>      
> >> Ok, I'll bite.  I should be able to get some database workloads with
> >> hugepages, transparent hugepages, and without any hugepages at all.
> >>    
> >
> > Please run them in conjunction with Mel Gorman's memory compaction,  
> > otherwise fragmentation may prevent huge pages from being instantiated.
> >
> 
> Strictly speaking, compaction is not necessary to allocate huge pages.
> What compaction gets you is
> 
>   o Lower latency and cost of huge page allocation
>   o Works on swapless systems
> 
> What is important is that you run
> hugeadm --set-recommended-min_free_kbytes
> from the libhugetlbfs 2.8 package early in boot so that
> anti-fragmentation is doing as good a job as possible.

Great, I'll make sure to do this.
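
Concretely, I'm planning something like this early in the init scripts
(just a sketch; assumes hugeadm from libhugetlbfs 2.8 is in the path):

    # raise min_free_kbytes so anti-fragmentation has room to work
    hugeadm --set-recommended-min_free_kbytes
    # sanity check what it picked
    cat /proc/sys/vm/min_free_kbytes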

> If one is very
> curious, use the mm_page_alloc_extfrag tracepoint to trace how often severe
> fragmentation-related events occur under default settings and with
> min_free_kbytes set properly.
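
For anyone else reproducing this, I'm assuming the usual ftrace interface
for that tracepoint (debugfs mounted at /sys/kernel/debug; the event lives
under the kmem group):

    # enable the fragmentation event, then watch it while the workload runs
    echo 1 > /sys/kernel/debug/tracing/events/kmem/mm_page_alloc_extfrag/enable
    cat /sys/kernel/debug/tracing/trace_pipe
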
> 
> Without the compaction patches, allocating huge pages will be occasionally
> *very* expensive as a large number of pages will need to be reclaimed.
> Most likely symptom is thrashing while the database starts up. Allocation
> success rates will also be lower when under heavy load.
> 
> Running make -j16 at the same time is unlikely to make much of a
> difference from a hugepage allocation point of view. The performance
> figures will vary significantly of course as make competes with the
> database for CPU time and other resources.

Heh, Linus did actually say to run them concurrently with make -j16, but
I read it as make -j16 before the database run.  My goal will be to
fragment the ram, then get a db in ram and see how fast it all goes.

Fragmenting memory during the run is only interesting as a test of
compaction; I'd throw out the resulting db benchmark numbers and only
count the number of transparent hugepages we were able to allocate.
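
Something crude along these lines for the fragmentation pass (a sketch of
the idea, not the final methodology):

    # churn page cache and slab with parallel kernel builds to fragment
    # memory, then start the db with its working set in ram
    for i in 1 2 3; do
        make -j16 > /dev/null 2>&1
        make clean > /dev/null 2>&1
    done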

> 
> Finally, benchmarking with databases is not new as such -
> http://lwn.net/Articles/378641/ . This was on fairly simple hardware
> though as I didn't have access to hardware more suitable for database
> workloads. If you are running with transparent huge pages though, be
> sure to double check that huge pages are actually being used
> transparently.

Will do.  It'll take me a few days to get the machines set up and take a
baseline measurement.
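
For the double check, I'm assuming the knobs the patch series exposes
(names as in the posted series; correct me if they've changed):

    # confirm transparent hugepages are enabled
    cat /sys/kernel/mm/transparent_hugepage/enabled
    # see how much anon memory is actually backed by huge pages
    grep AnonHugePages /proc/meminfo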

-chris

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@xxxxxxxxx
