Michael Kukat writes:
>
> i'm programming a lot of AVR firmware since a while, first in pure
> assembler, as gcc 3.x didn't produce code i liked very much. But the
> 4.1.1 i'm currently using produces quite good code if you sometimes
> try around to get the best results (as in "i would write nearly the
> same in assembler").
>
> But now, i found one strange thing where i currently can't find a
> workaround i really like. It's about a quite simple loop here:
>
> This is my perftest.c file:
> #include <avr/io.h>
> #include <inttypes.h>
>
> volatile uint8_t xx;
>
> void test() {
>     uint8_t ctr;
>
>     ctr = 0;
>     do {
>         xx = ctr;
>     } while(++ctr < 64);
> }
>
> quite simple. If i compile this like the following:
>
> avr-gcc -mmcu=atmega644 -O3 -S -o - perftest.c
>
> i see that ctr is used in 16bit:
>
>     ldi r24,lo8(0)
>     ldi r25,hi8(0)
> .L2:
>     sts xx,r24
>     adiw r24,1
>     cpi r24,64
>     cpc r25,__zero_reg__
>     brne .L2
> /* epilogue: frame size=0 */
>     ret
>
> Now i tried around a bit and see the following situation:
>
> if i replace the "xx = ctr" by an __asm__("nop"), the counter is 8
> bit, as it should be. If i use ctr = 64 before the loop and do
> while(--ctr);, the counter also is 8 bit (with xx = ctr in the loop).
> All experiments trying < (uint8_t) 64 and so didn't work, also !=
> instead of < doesn't work, i can't find a "nice" way to force the
> counter being 8 bit when using the value within the loop. I need that
> loop in this way, that's why i don't use --ctr.
>
> But i found one operation, which does the same like (++ctr < 64) in my
> case, but doesn't create the counter in 16bit but in 8bit, as desired:
> (++ctr % 64). With this, the result is exactly what i want:
>
>     ldi r24,lo8(0)
> .L2:
>     sts xx,r24
>     subi r24,lo8(-(1))
>     cpi r24,lo8(64)
>     brne .L2
> /* epilogue: frame size=0 */
>     ret
>
> Is there any reason for this strange 16bit behaviour?
> To me, it looks like the assignment xx = ctr seems to trigger some
> signed flags for the comparison, because ((++ctr & 0x7f) < 64) also
> produces the desired 8bit version. Might be okay in this loop, but
> isn't so fine if i need, say, 200 as the counter top value.
>
> What do you think about this?

This is strange, and might be a missed optimization bug. I can't see
any explanation for the behaviour you see here. What does the output
of -fdump-tree-optimized look like?

Andrew.
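
For anyone wanting to compare the loop variants discussed above, here is a minimal host-side sketch. It only checks that the three loop conditions from the post write the same values (0 through 63) to xx. It says nothing about the code avr-gcc generates for each. The names loop_lt, loop_mod, loop_mask, check_variants and the out buffer are hypothetical helpers added for illustration, and <stdint.h> is used in place of <avr/io.h> so the file builds with an ordinary host compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define COUNT 64

volatile uint8_t xx;   /* same role as in the original perftest.c */

/* Original condition from the post: (++ctr < 64). */
static unsigned loop_lt(uint8_t *out)
{
    unsigned n = 0;
    uint8_t ctr = 0;
    do {
        xx = ctr;
        out[n++] = ctr;      /* hypothetical trace of each value written */
    } while (++ctr < 64);
    return n;
}

/* Workaround from the post: (++ctr % 64). */
static unsigned loop_mod(uint8_t *out)
{
    unsigned n = 0;
    uint8_t ctr = 0;
    do {
        xx = ctr;
        out[n++] = ctr;
    } while (++ctr % 64);
    return n;
}

/* Second variant from the post: ((++ctr & 0x7f) < 64). */
static unsigned loop_mask(uint8_t *out)
{
    unsigned n = 0;
    uint8_t ctr = 0;
    do {
        xx = ctr;
        out[n++] = ctr;
    } while ((++ctr & 0x7f) < 64);
    return n;
}

/* Asserts that all three variants perform the same sequence of writes. */
void check_variants(void)
{
    uint8_t a[COUNT], b[COUNT], c[COUNT];
    unsigned na = loop_lt(a);
    unsigned nb = loop_mod(b);
    unsigned nc = loop_mask(c);

    assert(na == COUNT && nb == COUNT && nc == COUNT);
    assert(memcmp(a, b, COUNT) == 0);
    assert(memcmp(a, c, COUNT) == 0);
}
```

Calling check_variants() from a small main() on the host should pass silently. To look at the AVR side, each loop body can be pasted into perftest.c in turn, and -fdump-tree-optimized can be appended to the avr-gcc command line quoted above to produce the dump asked for in the reply.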