-falign-loops=16 on Apple gcc still gives loops not aligned to 16-byte address boundaries - why?

Dan White <dan@xxxxxxxxxxxxxx> · Fri, 17 Feb 2006 16:50:19 +0200

Hi,

I am using Apple gcc 4.0.1 to compile the VTK shared libraries,
optimized for speed, on an Apple G5 single-processor 1.6 GHz
machine.

The C++ flags set via ccmake (the CMake configuration front end used
to build VTK) are:
-fast -maltivec -fPIC -ftree-vectorize

The -fast flag is a shortcut for a bunch of flags, including
-falign-loops=16.

The build works fine.
I then run my VTK application and sample it with Apple's Shark tool.
Shark shows that one loop in a VTK library, which takes up a lot of
processor time, has not been aligned to a 16-byte boundary.
Shark tells me that I should use -falign-loops=16,
which I already am, since I am using -fast.
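
To be clear about what I mean by "aligned to a 16-byte boundary", here
is a minimal sketch. The helper names are made up for illustration and
are not part of VTK, Shark or gcc; the idea is only that the address of
the first instruction of the loop should have its low four bits clear.

```cpp
#include <cstdint>

// Hypothetical helper (illustration only): true when an address falls
// on a 16-byte boundary, i.e. its low four bits are all zero.
constexpr bool is_aligned_16(std::uintptr_t address) {
    return (address & 0xF) == 0;
}

// Hypothetical helper (illustration only): number of padding bytes
// needed to move an address up to the next 16-byte boundary
// (0 if the address is already on one).
constexpr std::uintptr_t padding_to_16(std::uintptr_t address) {
    return (16 - (address & 0xF)) & 0xF;
}
```

For example, a loop starting at address 0x1004 would not count as
aligned, and would need 12 bytes of padding in front of it to start
at 0x1010.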

Shark also tells me that this loop contains a single-precision
floating-point computation that could be sped up using AltiVec.
-fast also turns on -maltivec.
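
I have not extracted the actual VTK loop, so as a stand-in here is a
minimal sketch of the kind of single-precision loop I understand Shark
to be describing. The function name and signature are made up for
illustration, and the real VTK code will differ.

```cpp
#include <cstddef>

// Hypothetical stand-in for the hot loop (not actual VTK code):
// computes y[i] = a * x[i] + y[i] in single precision, one element
// per iteration. An AltiVec register is 128 bits wide and holds four
// 32-bit floats, so a vectorized version could in principle handle
// four elements per iteration.
void scale_and_add(float a, const float* x, float* y, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```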

Do any gurus know why this loop is not being aligned, and why it is
still slowing down execution?

cheers

Dan

Dr. Daniel James White BSc. (Hons.) PhD
Bioimaging Coordinator
Nanoscience Centre and Department of Biological and Environmental  
Sciences
Division of Molecular Recognition
Ambiotica C242
PO Box 35
University of Jyväskylä
Jyväskylä
FIN 40014
Finland

+358 14 260 4183 (work)
+358 468102840 (mobile)
http://www.bioimagexd.org
http://www.chalkie.org.uk
dan@xxxxxxxxxxxxxx
white@xxxxxxxxx