64 bit register use for inline assembly

I have come across an unusual scenario and am not sure whether it was a
compiler bug that is now fixed or whether something is not quite right
with the code.

In the following code, the p array is kept in a single 64-bit register
(%rdx) when compiling with optimization (such as -O2) on gcc 4.1.2. The
inline assembly in the first STORE32H then zero-extends away the upper
32 bits of that register, so the second STORE32H gets 0x00000000 for
its value.

A stripped-down test case and the resulting assembly code for each
compiler and set of options are included below.

Two things make the problem go away with 4.1.2: compiling with -m32, or
compiling without optimization.

The code works fine with gcc 4.2.3 on the same machine and on another
Linux OS (I haven't tried the newest gcc version yet). 4.2.3 still
optimizes the code but doesn't use the 64-bit register.

My understanding of inline assembly is that the compiler is responsible
for protecting the registers it allocates dynamically for the operands,
and that putting something in the clobber list therefore doesn't help.

I searched the gcc bugzilla extensively and haven't found anything that
specifically addresses this. It may have been fixed as a side effect
of something else, but I didn't want to file a bug since it works in a
newer version.

Thanks for any info,
Derek

Hardware: AMD Phenom Quad core 64 bit

Test code:

//*************************************************
// test.c
#include <stdio.h>

typedef unsigned ulong32;

#define STORE32H(x, y)            \
  asm __volatile__ (              \
    "bswapl %0     \n\t"          \
    "movl   %0,(%1)\n\t"          \
    "bswapl %0     \n\t"          \
    ::"r"(x), "r"(y));

static void pxor(ulong32 *p)
{
   p[1] ^= p[0];
}

int main(void)
{
   ulong32 p[2] = {0x00010001, 0xaaaaaaaa};
   unsigned char ctt[8];
   pxor(p);
   STORE32H(p[0], ctt);
   STORE32H(p[1], ctt+4);
   //Should be 0x00010001
   printf("ctt: 0x%02x%02x%02x%02x", ctt[0],ctt[1],ctt[2],ctt[3]);
   //Should be 0xaaabaaab
   printf(" 0x%02x%02x%02x%02x\n", ctt[4],ctt[5],ctt[6],ctt[7]);

   printf(" sizeof ulong32 should be 4: %zu\n", sizeof(ulong32));
   return 0;
}

//******************************************************************

GCC 4.1.2
 gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-cpu=generic
--host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)

cc -g -Wall -W -O2   -c -o test.o test.c


main:

    0x00000000004004c0 <main+0>:    sub    $0x18,%rsp
    0x00000000004004c4 <main+4>:    mov    $0xaaabaaab00010001,%rdx
    0x00000000004004ce <main+14>:   mov    %rsp,%rax
    0x00000000004004d1 <main+17>:   bswap  %edx
    0x00000000004004d3 <main+19>:   mov    %edx,(%rsp)
    0x00000000004004d6 <main+22>:   bswap  %edx
    0x00000000004004d8 <main+24>:   shr    $0x20,%rdx
    0x00000000004004dc <main+28>:   add    $0x4,%rax
    0x00000000004004e0 <main+32>:   bswap  %edx
    0x00000000004004e2 <main+34>:   mov    %edx,(%rax)
    0x00000000004004e4 <main+36>:   bswap  %edx


GCC 4.2.3
 /tmp/usr/local/bin/gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ./configure
Thread model: posix
gcc version 4.2.3

/tmp/usr/local/bin/gcc -g -Wall -W -O2   -c -o test.o test.c

    0x0000000000400480 <main+0>:    sub    $0x18,%rsp
    0x0000000000400484 <main+4>:    mov    $0x10001,%edx
    0x0000000000400489 <main+9>:    mov    %rsp,%rax
    0x000000000040048c <main+12>:   bswap  %edx
    0x000000000040048e <main+14>:   mov    %edx,(%rsp)
    0x0000000000400491 <main+17>:   bswap  %edx
    0x0000000000400493 <main+19>:   mov    $0xaaabaaab,%edx
    0x0000000000400498 <main+24>:   add    $0x4,%rax
    0x000000000040049c <main+28>:   bswap  %edx
    0x000000000040049e <main+30>:   mov    %edx,(%rax)
    0x00000000004004a0 <main+32>:   bswap  %edx
