On Thu, Aug 9, 2012 at 2:43 AM, Shweta Gupta
<er.shwetagupta.edu@xxxxxxxxx> wrote:

> But I see that instruction 19 gets a new number in lreg dump, and gets
> placed just before the 21 instruction:
>
> (insn 18 82 81 2 (set (mem:SI (post_inc:PSI (reg/f:PSI 66)) [0 S4 A32])
> (reg:SI 67 [ sobj ])) 10 {*movsi} (nil)
> (expr_list:REG_DEAD (reg:SI 67 [ sobj ])
> (expr_list:REG_INC (reg/f:PSI 66)
> (nil))))
>
> (insn 81 18 21 2 (set (reg:SI 68 [ sobj+4 ])
> (mem/s:SI (plus:PSI (reg/f:PSI 22 ar6)
> (const_int -40 [0xffffffd8])) [4 sobj+4 S4 A32])) 10
> {*movsi} (nil)
> (expr_list:REG_EQUIV (mem/s:SI (plus:PSI (reg/f:PSI 22 ar6)
> (const_int -40 [0xffffffd8])) [4 sobj+4 S4 A32])
> (nil)))
>
> (insn 21 81 80 2 (set (mem:SI (post_inc:PSI (reg/f:PSI 66)) [0 S4 A32])
> (reg:SI 68 [ sobj+4 ])) 10 {*movsi} (nil)
> (expr_list:REG_DEAD (reg:SI 68 [ sobj+4 ])
> (expr_list:REG_INC (reg/f:PSI 66)
>
> As we see above, instruction 81 is the same as instruction 19 (of sched1),
> and has got placed just above instruction 21, causing a stall.
>
> I am not clear why this happened in lreg? I suspected that it could a part
> of some reg-move transformation in register allocation. So, tried option
> -fnoreg-move also, but it has no effect.

Oh, this is an old problem with the local register allocator.  It
sometimes moves register loads to decrease the number of instructions
over which the register is live.  In effect, it undoes the first
scheduling pass.

I see that you are using GCC 4.2, which is fairly old.  I think it's
the bit in local-alloc.c with this comment:

  /* Now scan all regs killed in an insn to see if any of them are
     registers only used that once.  If so, see if we can replace the
     reference with the equivalent form.  If we can, delete the
     initializing reference and this register will go away.  If we
     can't replace the reference, and the initializing reference is
     within the same loop (or in an inner loop), then move the register
     initialization just before the use, so that they are in the same
     basic block.  */

That code is long gone--it was removed in GCC 4.4 when the IRA
register allocator went in.  So certainly one step you could try would
be upgrading to a newer version of GCC.

Other than that, the thing to pay attention to is the first number
after the [ that follows each mem in the RTL dump (the 0 in
[0 S4 A32], the 4 in [4 sobj+4 S4 A32]).  That is the memory alias set
number.  GCC won't move loads and stores in conflicting alias sets
across each other.  Alias set 0 conflicts with everything.  So you
could look into why some of the loads and stores have alias set 0.

Ian
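
A rough way to follow up on the alias set suggestion is to pull the
alias set numbers out of the dump mechanically.  The Python sketch below
is only an illustration, not anything GCC ships: the function names
alias_sets_by_insn and may_conflict are made up here, the regular
expression assumes the GCC 4.2 style MEM attributes seen in the dump
quoted in this thread ("[0 S4 A32]", "[4 sobj+4 S4 A32]"), and
may_conflict is a deliberately simplified model of the conflict rule
(set 0 conflicts with everything, equal sets conflict); the real check
in GCC also takes subset relationships between alias sets into account.

```python
import re

# Text copied from the lreg dump quoted in this thread.  insn 21 is cut
# off in the mail, so it is cut off here as well.
SAMPLE = """\
(insn 18 82 81 2 (set (mem:SI (post_inc:PSI (reg/f:PSI 66)) [0 S4 A32])
(reg:SI 67 [ sobj ])) 10 {*movsi} (nil)
(expr_list:REG_DEAD (reg:SI 67 [ sobj ])
(expr_list:REG_INC (reg/f:PSI 66)
(nil))))

(insn 81 18 21 2 (set (reg:SI 68 [ sobj+4 ])
(mem/s:SI (plus:PSI (reg/f:PSI 22 ar6)
(const_int -40 [0xffffffd8])) [4 sobj+4 S4 A32])) 10
{*movsi} (nil)
(expr_list:REG_EQUIV (mem/s:SI (plus:PSI (reg/f:PSI 22 ar6)
(const_int -40 [0xffffffd8])) [4 sobj+4 S4 A32])
(nil)))

(insn 21 81 80 2 (set (mem:SI (post_inc:PSI (reg/f:PSI 66)) [0 S4 A32])
(reg:SI 68 [ sobj+4 ])) 10 {*movsi} (nil)
(expr_list:REG_DEAD (reg:SI 68 [ sobj+4 ])
(expr_list:REG_INC (reg/f:PSI 66)
"""

# Matches either the start of an insn ("(insn 18 ") or a MEM attribute
# block whose first field is a decimal alias set number and whose last
# field is an alignment ("A32").  Register names such as "[ sobj ]" and
# constants such as "[0xffffffd8]" do not match.
TOKEN = re.compile(r"\(insn (\d+) |\[(\d+) [^\]]*\bA\d+\]")


def alias_sets_by_insn(dump):
    """Return {insn uid: [alias set of each MEM attribute block, in order]}."""
    result = {}
    current = None
    for m in TOKEN.finditer(dump):
        if m.group(1) is not None:
            # New insn header: remember its uid.
            current = int(m.group(1))
            result.setdefault(current, [])
        elif current is not None:
            # MEM attribute block belonging to the current insn.
            result[current].append(int(m.group(2)))
    return result


def may_conflict(a, b):
    """Simplified model (assumption): ignores alias set subset relations."""
    return a == 0 or b == 0 or a == b


if __name__ == "__main__":
    for uid, sets in alias_sets_by_insn(SAMPLE).items():
        if 0 in sets:
            print("insn %d has a MEM in alias set 0" % uid)
```

On the dump above this reports alias set 0 for the two post_inc stores
(insns 18 and 21) and alias set 4 for the load in insn 81 (counted
twice, since the same MEM also appears in the REG_EQUIV note).  Under
the simplified rule the load conflicts with both stores, which is
consistent with it not being free to move across them.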