Intel SSE2 extensions

don fisher <dfisher@xxxxxxxxxxxxxx> · Tue, 09 Sep 2003 16:11:39 -0700

Hello,

I am currently running Red Hat 9 with gcc version 3.2.2. I am able to 
see xmm registers used in the assembly listing, but never in packed mode. 
I tried

float slope_lut_h, slope_lut_v, temp __attribute__ ((vector_size (16)));
temp = slope_lut_h + slope_lut_v;

only to get an illegal operands error message. Also, how does one 
assign a value (the same value 4 times) to such a float vector?
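
For reference, a minimal sketch of one way a packed add and a four-way
broadcast might be written, assuming a GCC recent enough to accept the
vector_size attribute on a typedef (3.2.2 may behave differently). The
names v4sf, splat4, add4 and store4 are hypothetical, not from any
library. As far as I can tell, in the declaration above the attribute
binds only to temp, so the two operands remain plain floats.

```c
#include <string.h>

/* Hypothetical type name: four packed floats in one 16-byte vector. */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Broadcast one scalar into all four lanes. */
static v4sf splat4 (float x)
{
  v4sf v = { x, x, x, x };
  return v;
}

/* Element-wise add of two vectors; the compiler may emit one packed
   add when SSE code generation is enabled (e.g. with -msse). */
static v4sf add4 (v4sf a, v4sf b)
{
  return a + b;
}

/* Copy the four lanes out to an ordinary array for inspection. */
static void store4 (float out[4], v4sf v)
{
  memcpy (out, &v, sizeof v);
}
```

Calling store4 (out, add4 (splat4 (1.5f), splat4 (2.0f))) should leave
3.5 in each of the four elements of out.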

I also tried

__builtin_prefetch (&delta_ptr[i+16], 1, 1);

which resulted in "Illegal instruction (core dumped)". This is a P4, and 
may not support the cache prefetch. Does gcc have a function to return 
the processor capability mask?
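
A sketch of what such a capability query could look like, assuming an
x86 target and a newer GCC that ships the <cpuid.h> header (3.2.2 may
not have it). The function names cpu_feature_mask and cpu_has_sse2 and
the FEATURE_* macros are hypothetical; the bit positions are the ones
documented for EDX of CPUID leaf 1.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* __get_cpuid, assumed available in this GCC */
#endif

/* Feature bits in EDX of CPUID leaf 1. */
#define FEATURE_SSE   (1u << 25)
#define FEATURE_SSE2  (1u << 26)

/* Return the EDX feature mask of CPUID leaf 1, or 0 when the query
   is unavailable (non-x86 target or leaf not supported). */
static unsigned int cpu_feature_mask (void)
{
#if defined(__i386__) || defined(__x86_64__)
  unsigned int eax, ebx, ecx, edx;
  if (__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return edx;
#endif
  return 0;
}

/* Nonzero when the processor reports SSE2 support. */
static int cpu_has_sse2 (void)
{
  return (cpu_feature_mask () & FEATURE_SSE2) != 0;
}
```

A program could test cpu_has_sse2 () once at startup and fall back to
a scalar code path when it returns 0.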

I have used

typedef int v4sf __attribute__ ((mode(V4SF)));
v4sf matrix[256][256] __attribute__ ((aligned (16)));

as an example of my array declarations.
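
To make the intended use of such an array concrete, here is a small
sketch that fills it and sums one row with packed adds. It assumes the
vector_size typedef form rather than the mode(V4SF) form above, and
the helper names fill_matrix and sum_row are hypothetical.

```c
#include <string.h>

typedef float v4sf __attribute__ ((vector_size (16)));

#define ROWS 256
#define COLS 256

/* Same shape as the declaration above: a 16-byte aligned 2-D array
   whose elements are four-float vectors. */
static v4sf matrix[ROWS][COLS] __attribute__ ((aligned (16)));

/* Set every lane of every element to x. */
static void fill_matrix (float x)
{
  v4sf v = { x, x, x, x };
  int r, c;
  for (r = 0; r < ROWS; r++)
    for (c = 0; c < COLS; c++)
      matrix[r][c] = v;
}

/* Sum row r lane by lane, one vector add per element, and copy the
   four lane totals into out. */
static void sum_row (int r, float out[4])
{
  v4sf acc = { 0.0f, 0.0f, 0.0f, 0.0f };
  int c;
  for (c = 0; c < COLS; c++)
    acc = acc + matrix[r][c];
  memcpy (out, &acc, sizeof acc);
}
```

After fill_matrix (0.5f), sum_row on any row should give 128.0 in each
lane (256 elements of 0.5).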

Does the 3.2.2 compiler support the SSE packed float instructions? Is 
there a site with some example code I could examine?
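
The kind of example I have in mind would look roughly like the sketch
below, which uses the SSE intrinsics from <xmmintrin.h> instead of
assembler. It assumes the header is available and SSE is enabled at
compile time (for instance with -msse); otherwise it falls back to a
scalar loop. The function name add4_bias is hypothetical.

```c
#if defined(__SSE__)
#include <xmmintrin.h>   /* __m128, _mm_add_ps, _mm_set1_ps, ... */
#endif

/* out[i] = a[i] + b[i] + bias for four floats. */
static void add4_bias (float out[4], const float a[4],
                       const float b[4], float bias)
{
#if defined(__SSE__)
  __m128 va = _mm_loadu_ps (a);         /* unaligned load of 4 floats */
  __m128 vb = _mm_loadu_ps (b);
  __m128 vbias = _mm_set1_ps (bias);    /* same value in all 4 lanes */
  _mm_storeu_ps (out, _mm_add_ps (_mm_add_ps (va, vb), vbias));
#else
  /* Scalar fallback when SSE code generation is not enabled. */
  int i;
  for (i = 0; i < 4; i++)
    out[i] = a[i] + b[i] + bias;
#endif
}
```

The unaligned load and store variants are used so the sketch works on
any float array; with 16-byte aligned data, _mm_load_ps and
_mm_store_ps could be used instead.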

Thanks in advance for any advice or pointers. I am hesitant to join 
the gcc list, because I know little about writing compilers. I do like 
to push the envelope in application code, hopefully without resorting 
to assembly.

Thanks again,
don

--
--------------------------------------------------------------------
|    Don Fisher                      dfisher@xxxxxxxxxxxxxx        |
|    Steward Observatory                                           |
|    933 N. Cherry Ave.              VOICE: (520)621-7647          |
|    University of Arizona           FAX:   (520)621-9843          |
|    Tucson, AZ  85721                                             |
--------------------------------------------------------------------