Re: [PATCH 00 of 41] Transparent Hugepage Support #17

* Avi Kivity <avi@xxxxxxxxxx> wrote:

> > I think what would be needed is some non-virtualization speedup example of 
> > a 'non-special' workload, running on the native/host kernel. 'sort' is an 
> > interesting usecase - could it be patched to use hugepages if it has to 
> > sort through lots of data?
> 
> In fact it works well unpatched, the 6% I measured was with the system sort.

Yes - but you intentionally sorted something large. The question is: how big 
is the slowdown with small sizes (if there is one), and where is the 
break-even point (if any)?
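
A rough way to look for that break-even point would be to time the system 
sort over a range of input sizes, once with transparent hugepages on and once 
with them off. The sketch below is only an illustration: the helper names 
(gen_input, time_sort) and the SIZES variable are made up for this example, 
the timing relies on GNU date's %N, and the sysfs path in the comment is the 
one used by mainline kernels, which may not match this patch series.

```shell
#!/bin/sh
# Sketch: time 'sort' over increasing input sizes to look for a break-even
# point. Run the whole script twice, toggling THP in between (as root), e.g.
#   echo always > /sys/kernel/mm/transparent_hugepage/enabled
#   echo never  > /sys/kernel/mm/transparent_hugepage/enabled
# (assumed mainline path; adjust for the kernel under test).

# gen_input N FILE: write N pseudo-random integers, one per line.
gen_input() {
    awk -v n="$1" 'BEGIN {
        srand(1)                      # fixed seed: same data on every run
        for (i = 0; i < n; i++) print int(rand() * 1000000000)
    }' > "$2"
}

# time_sort FILE: print wall-clock seconds taken by a numeric sort of FILE.
# Assumes GNU date (the %N nanoseconds field is not POSIX).
time_sort() {
    start=$(date +%s.%N)
    LC_ALL=C sort -n "$1" > /dev/null
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" 'BEGIN { printf "%.3f\n", e - s }'
}

# SIZES is a hypothetical knob: line counts to test, smallest first.
for n in ${SIZES:-1000 10000 100000 1000000}; do
    f=$(mktemp)
    gen_input "$n" "$f"
    printf '%s lines: %s s\n' "$n" "$(time_sort "$f")"
    rm -f "$f"
done
```

Single runs at the small sizes will be dominated by process startup and 
noise, so each size would need many repetitions before reading anything into 
the difference.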

> > [...]
> >
> > Something like GIMP calculations would be a lot more representative of the 
> > speedup potential. Is it possible to run the GIMP with transparent 
> > hugepages enabled for it?
> 
> I thought of it, but raster work is too regular so speculative execution 
> should hide the tlb fill latency.  It's also easy to code in a way which 
> hides cache effects (no idea if it is actually coded that way).  Sort showed 
> a speedup since it defeats branch prediction and thus the processor cannot 
> pipeline the loop.

Would be nice to try because there are a lot of transformations within GIMP - 
and GIMP can be scripted. It's also a test for negatives: if there is an 
across-the-board _lack_ of speedups, it shows that it's not really general 
purpose but more specialized.

If the optimization is specialized, then that's somewhat of an argument 
against automatic/transparent handling. (Although even if the beneficiaries 
turn out to be only special workloads, transparency still has advantages.)

> I thought ray tracers with large scenes should show a nice speedup, but 
> setting this up is beyond my capabilities.

Oh, this tickled some memories: x264 video encoding can be very cache- and 
TLB-intensive. Something like the encoding of a 350 MB video file:

  wget http://media.xiph.org/video/derf/y4m/soccer_4cif.y4m       # NOTE: 350 MB!
  x264 --crf 20 --quiet soccer_4cif.y4m -o /dev/null --threads 4

would be another thing worth trying with transparent-hugetlb enabled.
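
When running such a test it helps to confirm that hugepages are really 
backing the encoder's memory. A minimal sketch, assuming the kernel exposes an 
AnonHugePages line in /proc/meminfo and the mode file at 
/sys/kernel/mm/transparent_hugepage/enabled (both are the mainline names and 
may differ in this patch series); anon_huge_kb and thp_mode are helper names 
invented for this example:

```shell
#!/bin/sh
# Sketch: helpers to check THP state around a benchmark run.

# anon_huge_kb [FILE]: print the AnonHugePages value (in kB) from a
# meminfo-style file, or 0 if the field is absent.
anon_huge_kb() {
    awk '$1 == "AnonHugePages:" { print $2; found = 1 }
         END { if (!found) print 0 }' "${1:-/proc/meminfo}"
}

# thp_mode [FILE]: print the active mode, i.e. the word inside [brackets]
# in a line such as "always madvise [never]".
thp_mode() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' \
        "${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
}

# Report the current state only where the files exist on this system.
if [ -r /sys/kernel/mm/transparent_hugepage/enabled ]; then
    echo "THP mode: $(thp_mode)"
fi
if [ -r /proc/meminfo ]; then
    echo "AnonHugePages: $(anon_huge_kb) kB"
fi
```

The AnonHugePages figure needs to be sampled while x264 is still running 
(for instance from a second terminal), since the pages are freed again once 
the encoder exits.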

(I've Cc:-ed x264 benchmarking experts - in case I missed something)

	Ingo
