Parmenides <mobile.parmenides@xxxxxxxxx> writes:

> For the purpose of understanding some of gcc's features, without knowing
> the details underlying gcc, I have to code some examples in C, compile
> them into assembly code, and then observe the result to get some ideas.
> Caching memory values in registers is one optimization performed by gcc;
> reordering instructions is another. A "memory" clobber in an inline
> assembly statement may influence both. I have coded an example in C
> to try to understand the former.
>
> int s = 0;
> int tst(int lim)
> {
>     int i;
>
>     for (i = 1; i < lim; i++)
>         s = s + i;
>
>     asm volatile(
>         "nop"
>     );
>
>     s = s * 10;
>
>     return s;
> }
>
> To compile the C source, the following command is executed.
> gcc -S -O tst.c
>
> The corresponding assembly code is as follows:
> tst:
>         pushl   %ebp
>         movl    %esp, %ebp
>         movl    8(%ebp), %ecx
>         cmpl    $1, %ecx
>         jle     .L2
>         movl    s, %edx
>         movl    $1, %eax
> .L4:
>         addl    %eax, %edx
>         incl    %eax
>         cmpl    %eax, %ecx
>         jne     .L4
>         movl    %edx, s      <--- After the loop, s is written back to memory.
> .L2:
>         movl    s, %eax      <--- Before evaluating 's = s * 10', s is
>                                   reloaded into a register.
>         leal    (%eax,%eax,4), %eax
>         addl    %eax, %eax
>         movl    %eax, s
>         popl    %ebp
>         ret
>
> So, the "memory" clobber has prevented the optimization. But for the
> latter case, namely reordering instructions, I cannot obtain an
> example like the above to illustrate how a "memory" clobber prevents
> instruction reordering. I don't know the circumstances under which
> gcc will do reordering. Without them, I cannot observe the effect of
> the "memory" clobber.

Instruction reordering is easier to observe on a machine other than
x86, one with long load latencies.
Here is an example, though:

int f (int *a, int *b, int c)
{
  int i, j;

  for (i = 0; i < c; i++)
    {
      int a0, a1, a2, a3;

      asm ("nop" : "=r" (j) : "r" (i));
      a0 = a[0];
      a1 = a[1];
      a2 = a[2];
      a3 = a[3];
      b[1] = a0;
      b[3] = a1;
      b[0] = a2;
      b[2] = a3;
    }
  return j;
}

When optimizing, the memory load instructions will be reordered to occur
before the asm.  At least, that's what I see with current mainline gcc
on x86_64.  This isn't a case of memory caching; it's reordering of the
load instructions across the asm.

Ian
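
For comparison, here is a sketch of the same function with a "memory"
clobber added to the asm. It is not something I have checked against a
particular gcc version, so treat the expected effect as an assumption:
with the clobber, gcc should no longer be free to move the loads from
a[] ahead of the asm. The name f_clobber is made up for this example,
the "0" (i) input constraint is my own change so that j has a defined
value (the nop leaves the shared register untouched), and __asm__ is
used in place of asm only so the file also builds in strict ISO mode.

```c
/* Variant of f() with a "memory" clobber on the asm statement.
   f_clobber is a hypothetical name used only for this illustration.  */
int f_clobber (int *a, int *b, int c)
{
  int i, j = 0;   /* j initialized so the c <= 0 case is well defined */

  for (i = 0; i < c; i++)
    {
      int a0, a1, a2, a3;

      /* "0" (i) places i in the same register as the output j, so
         after the nop j simply equals i.  The "memory" clobber tells
         the compiler the asm may read or write arbitrary memory, so
         the loads below should stay after it.  */
      __asm__ volatile ("nop" : "=r" (j) : "0" (i) : "memory");

      a0 = a[0];
      a1 = a[1];
      a2 = a[2];
      a3 = a[3];
      b[1] = a0;
      b[3] = a1;
      b[0] = a2;
      b[2] = a3;
    }
  return j;
}
```

Compiling both versions with something like "gcc -S -O2" and comparing
where the loads from a[] land relative to the nop should show the
difference, if the scheduler reorders them in the first place.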