Hello!
I have the following piece of C++ code which is supposed to make 16-bit
integers from 8-bit lo-byte/hi-byte values in a super-efficient way.
However, I'm a bit puzzled about the (i386) code generated by GCC.
----------------------------------------------------------------------
#include <iostream>
#include <stdint.h>
class Int16
{
private:
    union {
        uint_fast16_t i16;
        uint8_t       i8[sizeof (uint_fast16_t)];
    };
    static const int lo_byte_pos = 0;
    static const int hi_byte_pos = 1;
public:
    inline void set(uint_fast16_t v)  { i16 = v; }
    inline void setL(uint8_t v)       { i8[lo_byte_pos] = v; }
    inline void setH(uint8_t v)       { i8[hi_byte_pos] = v; }
    inline uint_fast16_t get()        { return i16; }
};
volatile uint8_t hi = 0x12, lo = 0x34;
int main()
{
    Int16 i;
    i.set(lo);
    i.setL(lo);
    i.setH(hi);
    std::cout << i.get() << std::endl;
}
----------------------------------------------------------------------
The i386 code generated by GCC 4.0.1 (-O3) for the interesting part of
main() is:
movb lo, %dl
movzbl %dl, %edx
movb lo, %al
movb %al, %dl
movb hi, %al
movzbl %al, %eax
movb %al, %dh
That's obviously a pretty good job, but two things still disturb me a bit:
1/ That GCC fails to transform
movb lo, %al
movb %al, %dl
movb hi, %al
movb %al, %dh
into
movb lo, %dl
movb hi, %dh
2/ The completely unnecessary zero-extension of the argument to setH():
movzbl %al, %eax
The funny thing is that this does not happen for setL(), even though
setL() and setH() have exactly the same declaration.
Unfortunately I haven't been able to try it with 4.1.0 yet...
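For what it's worth, a union-free variant along these lines would build
the value with a shift and an or instead of byte stores (just a sketch;
make16 is only a name I made up for illustration, and I haven't looked
at what code GCC generates for it):
----------------------------------------------------------------------
#include <stdint.h>

// Sketch only: compose the 16-bit value from its bytes with shift/or
// instead of writing individual bytes through a union.
static inline uint_fast16_t make16(uint8_t lo_byte, uint8_t hi_byte)
{
    return (uint_fast16_t) lo_byte | ((uint_fast16_t) hi_byte << 8);
}
----------------------------------------------------------------------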
Any way around this? Can anyone enlighten me on what's going on?
Cheers,
--
Christer Palm