> Ralf wrote:
>Have you tried to insert a large number of nops instead?

My investigation suggests that a single extra nop is sufficient. I have also tried inserting extra nops before the cache routine to see if the relative alignment of the instructions with respect to the cacheline has an influence, but it has no effect.

I suspect that if this occurs with the instruction following the loop, then something odd might be occurring on every loop iteration as well. I might try adjusting the instructions in the loop to see if that has any effect.

> Or preferably,
>how about replacing the __restore_flags() in your example with the
>following piece of inline assembler:
>
> __asm__ __volatile__("mtc0\t%0, $12" ::"r" (flags) : "memory");

I am happy that the current assembler code looks correct, but this change would make it simpler.

Jon