Hi,
There has been a discussion going on about "asm volatile" statement
reordering, prompted by surprising code that gcc generates for ARM
with certain source code and compiler flags. The discussion has been in the
comp.arch.embedded Usenet group, the gcc-arm-embedded project (the most
common source of gcc for embedded ARM devices, both for individuals and
for companies), and some other related projects. I believe it is time
to ask the gcc folks too!
This is a link to the gcc-arm-embedded issue, which is perhaps the most
complete account of the problem:
<https://bugs.launchpad.net/gcc-arm-embedded/+bug/1722849?comments=all>
In summary, the key code looks like this:

    uint32_t status;

    /* save the PRIMASK into 'status' */
    __asm volatile ("mrs %0,PRIMASK" : "=r" (status) :: );

    /* set PRIMASK to disable interrupts */
    __asm volatile ("cpsid i");

    foo(); /* call a function */

    /* restore PRIMASK from 'status' */
    __asm volatile ("msr PRIMASK,%0" :: "r" (status) : );

The device's interrupt priority mask is saved, interrupts are disabled,
then an external function is called in an uninterruptible critical
section, and finally the saved interrupt state is restored.
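
For anyone who wants to experiment, the intended sequence can be
sketched as a small self-contained test. Everything beyond the three
asm statements is an assumption made for illustration and is not taken
from the linked issue: the helper names (primask_read, irq_disable,
primask_write), the foo() that records what it sees, the check
function, and the host fallback in which a volatile variable stands in
for PRIMASK. The "memory" clobbers are also an addition; they are a
common way of telling gcc that an asm may touch memory, and this sketch
does not claim that they avoid the reordering.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
/* Cortex-M build: the three statements from the post. The "memory"
   clobbers are an addition of this sketch, not part of the original. */
static inline uint32_t primask_read(void)
{
    uint32_t status;
    __asm volatile ("mrs %0,PRIMASK" : "=r" (status) :: "memory");
    return status;
}
static inline void irq_disable(void)
{
    __asm volatile ("cpsid i" ::: "memory");
}
static inline void primask_write(uint32_t status)
{
    __asm volatile ("msr PRIMASK,%0" :: "r" (status) : "memory");
}
#else
/* Any other build: a plain volatile variable stands in for PRIMASK,
   purely so that the sequence can be exercised on a host machine. */
static volatile uint32_t sim_primask;
static inline uint32_t primask_read(void) { return sim_primask; }
static inline void irq_disable(void) { sim_primask = 1u; }
static inline void primask_write(uint32_t status) { sim_primask = status; }
#endif

/* Hypothetical stand-in for foo(): records the PRIMASK value it sees. */
static uint32_t primask_seen_in_foo;
static void foo(void)
{
    primask_seen_in_foo = primask_read();
}

/* The intended sequence: save, disable, call, restore. */
static uint32_t critical_call(void)
{
    uint32_t status = primask_read();  /* save the PRIMASK into 'status' */
    irq_disable();                     /* set PRIMASK to disable interrupts */
    foo();                             /* call a function */
    primask_write(status);             /* restore PRIMASK from 'status' */
    return status;
}

/* Checks the behaviour the programmer expects from the sequence. */
static void check_critical_call(void)
{
    uint32_t before = primask_read();
    uint32_t saved = critical_call();
    assert(saved == before);                   /* saved before disabling */
    assert((primask_seen_in_foo & 1u) == 1u);  /* foo() ran with PRIMASK set */
    assert(primask_read() == before);          /* original state restored */
}
```

If 'status' were read only after interrupts had been disabled, the
first assertion would fail whenever interrupts were enabled on entry,
and the "restore" would leave them disabled.
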
With a particular choice of surrounding code (see the link above for
details), -O1 optimisation (but not -O2 or -Os), -mcpu=cortex-m0plus or
-mcpu=cortex-m0 (but not m3 or m4), the compiler generates unexpected
code with the ordering effectively:

    uint32_t status;

    /* set PRIMASK to disable interrupts */
    __asm volatile ("cpsid i");

    foo(); /* call a function */

    /* save the PRIMASK into 'status' */
    __asm volatile ("mrs %0,PRIMASK" : "=r" (status) :: );

    /* restore PRIMASK from 'status' */
    __asm volatile ("msr PRIMASK,%0" :: "r" (status) : );

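Obviously, this is very different from the programmer's intention:
'status' now captures PRIMASK after interrupts have already been
disabled, so the final "restore" leaves them disabled.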
It seems there is a fair bit of luck involved regarding the surrounding
code and the compiler flags - perhaps it requires a specific amount of
register pressure before the compiler decides the re-arrangement is a
good idea. On the other hand, rough testing suggests it is fairly
independent of compiler version.
Much of this boils down to the question of when gcc is allowed to
re-order "asm volatile" statements, with respect to other "asm volatile"
statements, volatile memory accesses, and unknown functions (which may
contain observable behaviour).
My testing suggests that gcc will re-order "asm volatile" statements
that have an output, such as the "save the PRIMASK into status"
statement, but it will /not/ re-order "asm volatile" statements that
have no outputs.
Is that correct?
Is that the intended behaviour of "asm volatile"?
If so, is that a good design choice or should it be changed?
Could the documentation on the gcc web pages be improved?
Or is this a bug in gcc, and the statements should not have been re-ordered?
David