Strange gcc "de-optimizations" happening on SH target

"Colby Boles" <cboles@xxxxxxxxxxx> · Fri, 13 Jan 2006 14:06:46 -0800

I'm writing code for the SH2 processor family (SH7055 specifically)
using a gcc-3.4.1 cross-compiler, and I'm running into a strange
"optimization" bug. If I AND an unsigned char memory-mapped peripheral
register with a constant that has only one '1' bit (0x40 in this
example), the compiler tries to "optimize" the test by shifting a copy
of the register value right by 6 bits (three shlr2 instructions),
masking it with 1, and then testing the result, which on this target is
certainly a long-winded way of doing things. By contrast, if the bit
mask is 0x41, the resulting assembly is less than half as long (3
instructions instead of 7) and faster. I get the same results
regardless of the optimization options I set. Any ideas why this is
happening, or how I can turn this "optimization" off?

Colby

// real code which has the "optimization" problem

#define   SSR1   (*((volatile unsigned char *) 0xFFFFF00C))
if (SSR1 & (unsigned char)0x40) // test RDRF

// generated assembly snippet

 179 00ae 6030           mov.b   @r3,r0
 180 00b0 4009           shlr2   r0
 181 00b2 4009           shlr2   r0
 182 00b4 4009           shlr2   r0
 183 00b6 C901           and   #1,r0
 184 00b8 2008           tst   r0,r0
 185 00ba 8901           bt   .L20
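
For reference, a minimal host-side sketch of what the sequence above
appears to compute: the single-bit mask test rewritten as "shift right
by 6, then AND with 1". The names fake_ssr1, rdrf_by_mask,
rdrf_by_shift and forms_agree are hypothetical, and an ordinary
volatile byte stands in for the real register so the code can run off
target. It only illustrates that the two forms give the same answer; it
does not reproduce gcc's internal transformation.

```c
#include <stdint.h>

/* Hypothetical stand-in for the SSR1 register: an ordinary volatile
   byte, so the sketch can run on a host instead of the SH7055. */
volatile uint8_t fake_ssr1;

/* Mask form, as written in the source: 1 when bit 6 (0x40) is set. */
int rdrf_by_mask(void)
{
    return (fake_ssr1 & (uint8_t)0x40) != 0;
}

/* Shift form, mirroring the generated shlr2 x3 / and #1 sequence. */
int rdrf_by_shift(void)
{
    return (fake_ssr1 >> 6) & 1;
}

/* Returns 1 if both forms agree for every possible 8-bit value. */
int forms_agree(void)
{
    unsigned v;

    for (v = 0; v < 256; v++) {
        fake_ssr1 = (uint8_t)v;
        if (rdrf_by_mask() != rdrf_by_shift())
            return 0;
    }
    return 1;
}
```

Calling forms_agree() walks all 256 register values and compares the
two tests, so the difference between them is purely one of code size
and speed, not of behavior.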

// sample of how changing the constant to 0x41 prevents the
// "optimization" from being possible and makes for faster,
// smaller code

#define   SSR1   (*((volatile unsigned char *) 0xFFFFF00C))
if (SSR1 & (unsigned char)0x41) // sample mask with two '1' bits

// generated assembly snippet

 179 00ae 6030           mov.b   @r3,r0
 180 00b0 C841           tst   #65,r0
 181 00b2 8901           bt   .L20