A question about gcc's optimization

jfdong@xxxxxxxxx · Fri, 5 Nov 2004 13:16:13 +0800 (CST)

How does gcc optimize for function-unit hazards and read-after-write (RAW)
hazards on a MIPS machine?

I wrote a simple program:
----------------------
void main (void)
{
   int i,s1,s2;

   s1=0;
   s2=1;

   for (i=0;i<100;i++){
     s1=s1+i;
     s2=s2*i;
   }
.....
}
----------------------
I compiled it with -O2 and got the following assembly code:

[   2] 0x10001350:  27 bd ff d0  addiu sp,sp,-48
[   2] 0x10001354:  ff bf 00 20  sd ra,32(sp)         // RAW hazard <1>
[   2] 0x10001358:  ff bc 00 18  sd gp,24(sp)
[   2] 0x1000135c:  ff b0 00 10  sd s0,16(sp)         // These three `sd' use the same function unit
[   2] 0x10001360:  3c 01 00 02  lui at,2
[   2] 0x10001364:  24 21 b4 24  addiu at,at,-19420   // RAW hazard <2>
[   2] 0x10001368:  00 39 e0 2d  daddu gp,at,t9
[   5] 0x1000136c:  00 00 28 25  move a1,zero
[   6] 0x10001370:  24 10 00 01  li s0,1
[   8] 0x10001374:  00 00 18 25  move v1,zero         // RAW hazard <3>
[  10] 0x10001378:  02 03 00 18  mult s0,v1
[  10] 0x1000137c:  00 00 80 12  mflo s0
[  10] 0x10001380:  00 00 00 00  nop
[  10] 0x10001384:  00 00 00 00  nop
[   9] 0x10001388:  00 a3 28 21  addu a1,a1,v1
[   8] 0x1000138c:  24 63 00 01  addiu v1,v1,1
[   8] 0x10001390:  28 62 00 64  slti v0,v1,100
[   8] 0x10001394:  14 40 ff f8  bne v0,zero,0x10001378
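
For anyone who wants to reproduce the listing, the loop can be isolated in a small function whose results are handed back to the caller, so that -O2 has no reason to discard the computation as dead code. This is only a sketch under assumptions: the name `sum_and_product' and its parameters are hypothetical and do not appear in the original program, and the code elided after the loop is not reconstructed here.

```c
/* Hypothetical helper, not part of the original program: the same loop,
   moved into a function so that both results escape to the caller. */
void sum_and_product(int n, int *sum, int *product)
{
    int i, s1, s2;

    s1 = 0;
    s2 = 1;

    for (i = 0; i < n; i++) {
        s1 = s1 + i;   /* the addition (addu in the listing above) */
        s2 = s2 * i;   /* the multiplication (mult/mflo in the listing above) */
    }

    /* Store the results through the pointers so the loop stays live. */
    *sum = s1;
    *product = s2;
}
```

Calling it with n = 100 matches the loop bound in the program above. Note that the product becomes 0 after the first iteration, because i starts at 0.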

There are two questions:

1. On a superscalar CPU, the three `sd' instructions above are clustered
together and would be issued at the same time, yet they all need the same
function unit.  Why doesn't gcc schedule them apart?

2. There are three RAW hazards in the assembly code.  Why doesn't gcc
schedule the dependent instructions far enough apart to avoid them?

Thank you!

-Junfeng Dong

P.S.: I am not subscribed to this mailing list, so please reply directly
to my mailbox.  Thank you.