How does gcc optimize for functional unit hazards and read-after-write (RAW) hazards on a MIPS machine?

I wrote a simple program:

----------------------
void main (void)
{
    int i,s1,s2;

    s1=0;
    s2=1;
    for (i=0;i<100;i++){
        s1=s1+i;
        s2=s2*i;
    }
    .....
}
----------------------

I compiled it with -O2 and got assembly code like this:

[  2] 0x10001350: 27 bd ff d0  addiu  sp,sp,-48
[  2] 0x10001354: ff bf 00 20  sd     ra,32(sp)           // RAW hazard <1>
[  2] 0x10001358: ff bc 00 18  sd     gp,24(sp)
[  2] 0x1000135c: ff b0 00 10  sd     s0,16(sp)           // These three `sd' use the same functional unit
[  2] 0x10001360: 3c 01 00 02  lui    at,2
[  2] 0x10001364: 24 21 b4 24  addiu  at,at,-19420        // RAW hazard <2>
[  2] 0x10001368: 00 39 e0 2d  daddu  gp,at,t9
[  5] 0x1000136c: 00 00 28 25  move   a1,zero
[  6] 0x10001370: 24 10 00 01  li     s0,1
[  8] 0x10001374: 00 00 18 25  move   v1,zero             // RAW hazard <3>
[ 10] 0x10001378: 02 03 00 18  mult   s0,v1
[ 10] 0x1000137c: 00 00 80 12  mflo   s0
[ 10] 0x10001380: 00 00 00 00  nop
[ 10] 0x10001384: 00 00 00 00  nop
[  9] 0x10001388: 00 a3 28 21  addu   a1,a1,v1
[  8] 0x1000138c: 24 63 00 01  addiu  v1,v1,1
[  8] 0x10001390: 28 62 00 64  slti   v0,v1,100
[  8] 0x10001394: 14 40 ff f8  bne    v0,zero,0x10001378

I have two questions:

1. On a superscalar CPU, the three `sd' instructions above are clustered
   together and would be issued at the same time, yet they all use the same
   functional unit. Why doesn't gcc schedule them apart?

2. There are three RAW hazards in the assembly code. Why doesn't gcc
   schedule the dependent instructions far enough apart to resolve the
   hazards?

Thank you!

-Junfeng Dong

P.S.: I am not subscribed to this mailing list, so please reply directly to my mailbox. Thank you.