On Mon, Mar 21, 2011 at 7:49 AM, Matthias Kretz <kretz@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> Hi,
>
> On Monday 21 March 2011 15:23:02 Matthias Kretz wrote:
>> I tested the GCC 4.6.0 RC on Intel systems with good success so far. Now I
>> tested on an AMD Magny-Cours using the -march=barcelona flag and gcc
>> translated _mm_store_pd/s calls in the code to streaming stores in the
>> resulting binary.
>>
>> Where does this "optimization" come from and how can I disable it? This
>> doesn't make much sense on a working set that fits into the cache...
>>
>> Is this intended behavior or a bug?
>
> Additional info: If I add -fno-prefetch-loop-arrays I get normal stores as
> expected. I don't consider this a solution, though.
>
> Regards,
> Matthias
>
> --
> Dipl.-Phys. Matthias Kretz
> http://compeng.uni-frankfurt.de/?mkretz
>

Do you mean _mm_stream_pd/s? I think store will still take your values to cache...

Brian