Consider the following test case:

Test Case Reference:
--------------------
int siVect[40] ;
int siCoeff[40] ;
int siSumofDotProduct ;
int siIndex1, siIndex2 ;

vTestMultipleArrayAccessWithDifferentLoopIndex()
{
  for (siIndex1=0 ; siIndex1<40 ; siIndex1++)
  {
    siSumofDotProduct += siVect[siIndex1] * siCoeff[siIndex2];
    siIndex2++;
  }
}

Assembly generated by bfin port (GCC 4.4.0 --target=bfin-elf) (-Os):
--------------------------------------------------------------------
        P1 = 41 (X);
        LSETUP (.L3, .L6) LC1 = P1;   <---- (1)
        jump.s .L2;                   <---- (2)
.L3:
        R2 = [P0++];
        R3 = [P5++];
        R2 *= R3;
        R0 += 1;
        R1 = R1 + R2;
.L2:
        R2 = P2;
.L6:
        P2 += 1;

Here, the marked instruction (2), "jump.s", is wrongly generated along
with the hardware loop.

My hypothesis:
----------------
Consider the following snippet, which is executed during the
target-dependent reorganization of loops:

Reference: gcc4.4.0/gcc/config/bfin/bfin.c
Function: bfin_optimize_loop

---snip---
  :
  bb = loop->tail;
  last_insn = PREV_INSN (loop->loop_end);

  while (1)
    {
      int bbno = bb->index;
      for (; last_insn != PREV_INSN (BB_HEAD (bb));
           last_insn = PREV_INSN (last_insn))
        if (INSN_P (last_insn))
          break;

      if (last_insn != PREV_INSN (BB_HEAD (bb)))
        break;

      if (single_pred_p (bb)
          && single_pred (bb) != ENTRY_BLOCK_PTR)
        {
          bb = single_pred (bb);
          last_insn = BB_END (bb);
          continue;
        }
      else
        {
          last_insn = NULL_RTX;
          break;
        }
    }

  if (!last_insn)
    {
      if (dump_file)
        fprintf (dump_file, ";; loop %d has no last instruction\n",
                 loop->loop_no);
      goto bad_loop;
    }
---snip---

The above 'while' loop is executed for the block containing the loop.
It walks the instructions within the loop from bottom to top. If it
finds any instruction for which INSN_P is true, it breaks, i.e. it
assumes that the loop is valid for replacement.
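To make the scan described above easier to reason about, here is a minimal sketch of it on toy data. Everything in it is an assumption made for illustration: `toy_kind`, `toy_block` and `toy_find_last_insn` are hypothetical names, not GCC types or functions, and the instruction array is assumed to exclude the loop_end jump itself (the real code starts at PREV_INSN (loop->loop_end)).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC's insn chain and basic_block (hypothetical). */
enum toy_kind { TOY_NOTE, TOY_LABEL, TOY_INSN };

struct toy_block {
    int index;                     /* basic block number */
    const enum toy_kind *insns;    /* program order, loop_end jump excluded */
    int n_insns;
    int n_preds;                   /* number of predecessor blocks */
    const struct toy_block *pred;  /* meaningful only when n_preds == 1 */
    int pred_is_entry;             /* nonzero if that predecessor is the entry block */
};

/* Walk backward from the end of `tail`.  Returns the index of the block
   in which the first real insn is found and stores its position in *pos;
   returns -1 when the scan gives up ("no last instruction").  */
int toy_find_last_insn(const struct toy_block *tail, int *pos)
{
    assert(tail != NULL && pos != NULL);

    const struct toy_block *bb = tail;
    while (bb != NULL) {
        /* Bottom-to-top scan of the current block. */
        for (int i = bb->n_insns - 1; i >= 0; i--) {
            if (bb->insns[i] == TOY_INSN) {
                *pos = i;
                return bb->index;   /* stops here; predecessors are never looked at */
            }
        }
        /* Block held only notes/labels: step into a single predecessor,
           otherwise give up.  */
        if (bb->n_preds == 1 && !bb->pred_is_entry)
            bb = bb->pred;
        else
            return -1;
    }
    return -1;
}
```

In this model the predecessor count is only consulted when a block contains no real instruction. For a tail block that holds real instructions and has two predecessors, the function returns that block's index without reading `n_preds` at all.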

Now, consider the RTL dump (using the -fdump-rtl-all option, file with
extension .alignment) with the -Os option:

---snip---
(jump_insn 112 86 113 2 /home/meena/Desktop/test.c:8 (set (pc)
        (label_ref 57)) -1 (nil))

(barrier 113 112 62)

(code_label 62 113 49 3 3 "" [1 uses])

(note 49 62 50 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

(insn 50 49 51 3 /home/meena/Desktop/test.c:10 (set (reg:SI 2 R2 [100])
        (mem/s:SI (post_inc:SI (reg:SI 8 P0 [orig:88 ivtmp.35 ] [88])) [2 siCoeff S4 A32])) 14 {*movsi_insn} (expr_list:REG_INC (reg:SI 8 P0 [orig:88 ivtmp.35 ] [88])
        (nil)))

(insn 51 50 52 3 /home/meena/Desktop/test.c:10 (set (reg:SI 3 R3 [102])
        (mem/s:SI (post_inc:SI (reg:SI 13 P5 [orig:89 ivtmp.31 ] [89])) [2 siVect S4 A32])) 14 {*movsi_insn} (expr_list:REG_INC (reg:SI 13 P5 [orig:89 ivtmp.31 ] [89])
        (nil)))

(insn 52 51 53 3 /home/meena/Desktop/test.c:10 (set (reg:SI 2 R2 [100])
        (mult:SI (reg:SI 2 R2 [100])
            (reg:SI 3 R3 [102]))) 75 {mulsi3} (expr_list:REG_DEAD (reg:SI 3 R3 [102])
        (nil)))

(insn 53 52 54 3 /home/meena/Desktop/test.c:10 (set (reg:SI 1 R1 [orig:94 siSumofDotProduct_lsm.17 ] [94])
        (plus:SI (reg:SI 1 R1 [orig:94 siSumofDotProduct_lsm.17 ] [94])
            (reg:SI 2 R2 [100]))) 45 {addsi3} (expr_list:REG_DEAD (reg:SI 2 R2 [100])
        (nil)))

(insn 54 53 57 3 /home/meena/Desktop/test.c:10 (set (reg:SI 0 R0 [orig:90 ivtmp.25 ] [90])
        (plus:SI (reg:SI 0 R0 [orig:90 ivtmp.25 ] [90])
            (const_int 1 [0x1]))) 45 {addsi3} (nil))

(code_label 57 54 58 4 2 "" [1 uses])

(note 58 57 60 4 [bb 4] NOTE_INSN_BASIC_BLOCK)

(insn 60 58 61 4 /home/meena/Desktop/test.c:10 (set (reg:SI 2 R2 [orig:87 siIndex2_lsm.37 ] [87])
        (reg:SI 10 P2 [orig:91 ivtmp.22 ] [91])) 14 {*movsi_insn} (nil))

(insn 61 60 85 4 /home/meena/Desktop/test.c:10 (set (reg:SI 10 P2 [orig:91 ivtmp.22 ] [91])
        (plus:SI (reg:SI 10 P2 [orig:91 ivtmp.22 ] [91])
            (const_int 1 [0x1]))) 45 {addsi3} (nil))

(jump_insn 85 61 65 4 /home/meena/Desktop/test.c:8 (parallel [
            (set (pc)
                (if_then_else (ne (reg:SI 9 P1 [106])
                        (const_int 1 [0x1]))
                    (label_ref 62)
                    (pc)))
            (set (reg:SI 9 P1 [106])
                (plus:SI (reg:SI 9 P1 [106])
                    (const_int -1 [0xffffffff])))
            (unspec [
                    (const_int 0 [0x0])
                ] 10)
            (clobber (scratch:SI))
        ]) 89 {loop_end} (expr_list:REG_BR_PROB (const_int 9100 [0x238c])
        (nil)))
---snip---

Please note that the loop body is in block 3, marked by

(note 49 62 50 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

whereas the loop end instruction (jump_insn 85) is part of a separate
block, block 4, marked by

(note 58 57 60 4 [bb 4] NOTE_INSN_BASIC_BLOCK)

This block has two predecessors, "bb 2" and "bb 3".

As I understand it, the above snippet works correctly only when the
loop body and the loop end instruction are part of the same basic block
and that block has only one predecessor. These conditions do not hold
with the -Os option, so the loop should not be replaced by a hardware
loop in this case.

However, because the following instructions (for which INSN_P is true)
are present in basic block 4,

(insn 60 58 61 4 /home/meena/Desktop/test.c:10 (set (reg:SI 2 R2 [orig:87 siIndex2_lsm.37 ] [87])
        (reg:SI 10 P2 [orig:91 ivtmp.22 ] [91])) 14 {*movsi_insn} (nil))

(insn 61 60 85 4 /home/meena/Desktop/test.c:10 (set (reg:SI 10 P2 [orig:91 ivtmp.22 ] [91])
        (plus:SI (reg:SI 10 P2 [orig:91 ivtmp.22 ] [91])
            (const_int 1 [0x1]))) 45 {addsi3} (nil))

the 'while' loop breaks out as soon as it reaches them, before the
predecessor check is ever made, and the hardware loop code is wrongly
generated.

Please verify my understanding.

Thanks and Regards,
Meena