Jamal,

There may be a cache-line read-after-write pseudo-dependency that, combined with how the code is aligned on instruction-pair doublewords, does something weird to the sb1250 pipeline. Just my guess.

Have a great weekend,
Adam

-----Original Message-----
From: jamal [mailto:hadi@xxxxxxxxxx]
Sent: Friday, May 28, 2004 12:37 PM
To: linux-mips@xxxxxxxxxxxxxx
Cc: sibyte-users@xxxxxxxxxxxx
Subject: weird sb1250 behavior

Found some very strange behavior with the sb1250. Gcc 3.2.3 with sibyte mods. Running Linux 2.4.21 with whatever mods off sibyte.

Testcase: sending a large amount of traffic -->eth0-->someprocessing-->eth1

Given the nature of the processing, say I was getting 100Kpps throughput. Now I fire up a very basic program that just loops forever, summing two numbers.

---
 1 #include <stdlib.h>
 2
 3 int main ()
 4 {
 5     int a = 1;
 6     int b = 2;
 7     int c = 0;
 8     // int c;
 9     while (1) {
10         c = a + b;
11     }
12 }
--------

I see very little drop in throughput, probably around 0.01%.

Now comment out line 7 and uncomment line 8. Hallelujah. Performance drops to about 100pps. That's about a factor of 1000 down!

The interesting thing is that if you add a nop (__asm__ __volatile__("nop");) to the second version just before the while loop, we get back the same performance as in the earlier version.

Apologies in advance for attaching objdumps (since there may be folks who don't have access to the sibyte tools):

1) while-init-dis is for case 1, where c is initialized
2) while-noinit-dis is for case 2, where c is not initialized
3) while-nop-dis is for case 3, where the nop is thrown in

cheers,
jamal
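
For reference, case 3 above (c left uninitialized, with a nop just before the loop) could be sketched roughly as below. The function name sum_with_nop and the iteration bound are assumptions added only so the sketch terminates and can be checked; the test program in the report uses while (1) and never returns.

```c
/* Sketch of case 3: c is not initialized and a nop sits just before the loop.
 * sum_with_nop() and its iteration bound are hypothetical; the reported
 * test program loops forever in main(). */
int sum_with_nop(long iterations)
{
    int a = 1;
    int b = 2;
    int c;                          /* not initialized, as in case 2 */

    if (iterations <= 0)
        return 0;                   /* never return an uninitialized c */

    /* One nop placed just before the loop, as described in the report
       (GCC inline-assembly syntax). */
    __asm__ __volatile__("nop");

    while (iterations-- > 0) {
        c = a + b;
    }
    return c;
}
```

Building this and the same code without the nop using identical flags, then comparing the objdump -d output, should show whether the loop body moves by one instruction slot. Whether that shift explains the throughput difference is only a guess along the lines of Adam's reply.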