On 04/07/2011 18:57, Georg-Johann Lay wrote:
Hi,
this it not a question on GCC, so I apologize for asking a question on
C strict aliasing rules here. As I know that some people reading this
list are much more familiar with C standard than I am, allow me to ask
that question, anyway.
Suppose the following C code that tries to implement the standard
copysign function, i.e. copy the sign of y into x and return that
value. double be 64 bits wide:
#define MASK ((unsigned short) 0x8000)
double copysign (double x, double y)
{
unsigned short * const px = (unsigned short*)(char*)&x + 3;
unsigned short * const py = (unsigned short*)(char*)&y + 3;
*px = *px& ~MASK | *py& MASK;
return x;
}
While I say that this code is not correct because it breaks C's strict
aliasing rules (e.g C89/90, Chapter 6.3; C98/99, Chapter 6.5, Clause
7), some other person very well familiar with the standard claims that
is correct and no problem.
So I want to reassure me if the code is ok or not.
I am aware of gcc's -f[no]-strict-aliasing switch, but that's rather
technical detail to cope with such presumably incorrect code.
Thanks,
Johann
BTW, gcc does just what I would expect: it returns x unchanged and
that is not an "optimizer bug".
Does it get compiled to an empty function?
Anyway, you are correct that strict aliasing means that this function is
buggy - the compiler can assume that a "double" such as x or y cannot
share the same address as a "short" (or "unsigned short"), thus the
pointer "px" cannot point to x, thus x is not changed by the function,
and can be returned unchanged.
The extra "(char*)" in the cast may be an attempt to work around
aliasing, but it won't work.
In other words, the function is buggy and unclear - never a good
combination. And for the 8-bit AVR (I'm guessing that's the target you
are interested in...), it is inefficient as well.
The code assumes 8-byte doubles, 16-bit shorts and little-endian
ordering. Obviously that's a common combination, but it's not the only
possibility - and IIRC the AVR uses 32-bit doubles at the moment (and
some embedded targets allow 32-bit doubles as an option for efficiency).
It would be /far/ better to stick to char pointers - these may alias
anything, and are more efficient on an 8-bit cpu (without being /too/
bad on most other targets). The alternative would be to use union types
to do the casting - that would make more sense on 32-bit RISC targets
without direct byte operations (such as earlier ARMs).
double copysign (double x, double y)
{
unsigned char * const px = (unsigned char*)(&x + 1) - 1;
const unsigned char * const py = (unsigned char*)(&y + 1) - 1;
*px = (*px & ~0x80) | (*py & 0x80);
return x;
}
This should make px and py point to the last bytes of x and y,
regardless of the size of a double. I've added a couple more brackets,
and removed the MASK macro - that's just wanton name-space pollution.
It's still little-endian only, of course.