Hi,
I am using Apple GCC 4.0.1 to compile the VTK shared
libraries, optimized for speed, on an Apple G5 single-processor 1.6 GHz
machine.
The C++ flags I set in ccmake . (the configuration tool used to build VTK) are
-fast -maltivec -fPIC -ftree-vectorize
The -fast flag is a shortcut for a bunch of flags, including
-falign-loops=16.
The build works fine.
When I then run my VTK application and sample it with Apple's Shark tool,
I see that one loop in a VTK library,
which takes up a lot of processor time,
has not been aligned to a 16-byte boundary.
Shark tells me that I should use -falign-loops=16,
which I already am, since I am using -fast.
Shark also tells me that this loop contains a single-precision floating-point
computation that could be sped up using AltiVec.
-fast also turns on -maltivec.
Do any gurus know why this loop is not being aligned and is still
slowing down the executable?
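
To make the question concrete, here is a minimal sketch of the general kind of single-precision loop that Shark's AltiVec hint refers to. It is a hypothetical illustration only: it is not the actual VTK loop, and the function name scaleAndAdd and its parameters are made up. Whether GCC 4.0.1 would actually vectorize or align a loop like this under the flags above is an assumption I have not verified.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example (not VTK code): computes out[i] = a * x[i] + y[i]
// in single precision. Each iteration is independent of the others,
// which is the property a vectorizer needs in order to process
// several floats per instruction.
void scaleAndAdd(float a,
                 const std::vector<float>& x,
                 const std::vector<float>& y,
                 std::vector<float>& out)
{
    // Both inputs must have the same length.
    assert(x.size() == y.size());

    const std::size_t n = x.size();
    out.resize(n);

    // The hot loop: one multiply and one add per element.
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a * x[i] + y[i];
    }
}
```

A loop of this shape is what I would expect -falign-loops=16 and -ftree-vectorize to act on, so a small test file like this, compiled with the same flags and sampled in Shark, might show whether the flags are taking effect at all.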
cheers
Dan
Dr. Daniel James White BSc. (Hons.) PhD
Bioimaging Coordinator
Nanoscience Centre and Department of Biological and Environmental
Sciences
Division of Molecular Recognition
Ambiotica C242
PO Box 35
University of Jyväskylä
Jyväskylä
FIN 40014
Finland
+358 14 260 4183 (work)
+358 468102840 (mobile)
http://www.bioimagexd.org
http://www.chalkie.org.uk
dan@xxxxxxxxxxxxxx
white@xxxxxxxxx