> Still, for "-mips2" the code is not quite perfect. I'm guessing that gas makes only one pass. When it first looks at the first load, the nop is necessary. When it later moves the second load into the branch delay slot, it doesn't go back and check whether the nop after the first load is still necessary. To get this perfect, we would have to add global optimization support to gas, so that it considered all nop insertions and branch delay slot filling together and iterated until it got the best code. I think it is pointless to do this kind of work in an assembler when we already have an optimizing compiler with the infrastructure for it.
>
> Jim
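
To make the one-pass guess above concrete, here is a toy model in Python. It is not gas code and does not claim to match what gas does internally. The `Insn` tuple, the hazard rule (a load's result may not be read by the very next instruction), the delay-slot filler, and all function names are assumptions invented for illustration. The model also ignores what executes after a load that ends up in a delay slot.

```python
from collections import namedtuple

# Hypothetical toy instruction: op is "load", "alu", "branch" or "nop".
Insn = namedtuple("Insn", "op dest srcs")
NOP = Insn("nop", None, ())


def hazard(prev, nxt):
    """True if nxt reads the register that the load prev has just written."""
    return prev is not None and prev.op == "load" and prev.dest in nxt.srcs


def can_fill_slot(out, branch):
    """Can the last emitted instruction move into the branch's delay slot?"""
    if not out:
        return False
    prev = out[-1]
    before = out[-2] if len(out) > 1 else None
    if prev.op not in ("load", "alu"):
        return False
    if before is not None and before.op == "branch":
        return False  # prev already sits in another branch's delay slot
    if prev.dest in branch.srcs:
        return False  # the branch needs prev's result first
    return not hazard(before, branch)


def one_pass(insns):
    """Single forward pass: nops are decided once and never revisited."""
    out = []
    for insn in insns:
        if insn.op == "branch" and can_fill_slot(out, insn):
            moved = out.pop()
            out += [insn, moved]  # the moved instruction fills the delay slot
            continue
        if out and hazard(out[-1], insn):
            out.append(NOP)  # looks necessary at this point in the pass
        out.append(insn)
        if insn.op == "branch":
            out.append(NOP)  # nothing safe to move, so pad the slot
    return out


def recheck_nops(code):
    """Repeat until stable: drop hazard nops the final order no longer needs."""
    code = list(code)
    changed = True
    while changed:
        changed = False
        for i in range(1, len(code)):
            if code[i].op != "nop" or code[i - 1].op == "branch":
                continue  # not a nop, or a delay-slot nop that must stay
            nxt = code[i + 1] if i + 1 < len(code) else None
            if nxt is None or not hazard(code[i - 1], nxt):
                del code[i]
                changed = True
                break
    return code


program = [
    Insn("load", "r1", ("r4",)),    # first load
    Insn("load", "r2", ("r1",)),    # second load, its address depends on r1
    Insn("branch", None, ("r5",)),  # branch that reads neither result
]

single = one_pass(program)
print([i.op for i in single])                # ['load', 'nop', 'branch', 'load']
print([i.op for i in recheck_nops(single)])  # ['load', 'branch', 'load']
```

In this sketch the nop survives the single pass because it was emitted before the second load moved into the delay slot. Once the branch sits between the two loads, the re-check finds the nop is no longer needed. Doing that re-check in general is the extra analysis the message argues belongs in the compiler rather than the assembler.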