Hi,

 A recent change broke CPU_DADDI_WORKAROUNDS support in memset.S, due to a delay-slot instruction expanding to multiple hardware operations for the affected configurations. The underlying cause is the excessive use of the `noreorder' assembly mode, which is only needed in a couple of places: where there is a data dependency between a branch and its delay-slot instruction, or where a section switch is involved that would prevent automatic delay-slot scheduling.

 These changes address both problems. For clarity, to avoid mixing conceptually separate changes, and to make backporting easier, I have made them a small patch series. See individual change descriptions for details.

 This has been build-time and run-time verified with 32-bit and 64-bit DECstation configurations, and build-time verified with big-endian and little-endian 64-bit SWARM configurations. Build-time verification was made by running `objdump -d arch/mips/lib/memset.o' with a pristine and a patched build to make sure there has been no change in machine code generation, except for the delay-slot instruction that expands to multiple instructions with the 64-bit CPU_DADDI_WORKAROUNDS DECstation configuration.

 Please apply.

  Maciej