generated movaps with unaligned memory

Hi,

my shared library crashes on a movaps instruction that accesses unaligned memory.

Since the library function is called from the dynamic linker, which
prepares the memory location, I'm not sure whose side the issue is on.

I have the following function in C:

typedef float La_x86_64_xmm __attribute__ ((__vector_size__ (16)));

typedef struct La_x86_64_retval
{
  uint64_t lrv_rax;
  uint64_t lrv_rdx;
  La_x86_64_xmm lrv_xmm0;
  La_x86_64_xmm lrv_xmm1;
  long double lrv_st0;
  long double lrv_st1;
} La_x86_64_retval;

unsigned int la_x86_64_gnu_pltexit (Elf64_Sym *__sym,
                unsigned int __ndx, uintptr_t *__refcook, uintptr_t *__defcook,
                const La_x86_64_regs *__inregs, La_x86_64_retval *__outregs,
                const char *symname)
{
        La_x86_64_xmm b __attribute__ ((aligned(16)));
        b = __outregs->lrv_xmm0;
        return 0;
}

This ends up as the following assembly:

00000000000007d7 <la_x86_64_gnu_pltexit>:
 7d7:   55                      push   %rbp
 7d8:   48 89 e5                mov    %rsp,%rbp
 7db:   48 89 7d e8             mov    %rdi,-0x18(%rbp)
 7df:   89 75 e4                mov    %esi,-0x1c(%rbp)
 7e2:   48 89 55 d8             mov    %rdx,-0x28(%rbp)
 7e6:   48 89 4d d0             mov    %rcx,-0x30(%rbp)
 7ea:   4c 89 45 c8             mov    %r8,-0x38(%rbp)
 7ee:   4c 89 4d c0             mov    %r9,-0x40(%rbp)
 7f2:   48 8b 45 c0             mov    -0x40(%rbp),%rax
 7f6:   0f 28 40 10             movaps 0x10(%rax),%xmm0
 7fa:   0f 29 45 f0             movaps %xmm0,-0x10(%rbp)
 7fe:   b8 00 00 00 00          mov    $0x0,%eax
 803:   c9                      leaveq
 804:   c3                      retq


It looks like the xmm0 register is used to transfer the data. However,
the structure passed in is not aligned on a 16-byte boundary, so movaps
will crash.
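
To confirm that the pointer really is misaligned, it can be checked before the load. The following is only a minimal sketch; `is_aligned` is a hypothetical helper name, not part of glibc or the audit interface:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is a multiple of `alignment`,
   0 otherwise. `alignment` must be a non-zero power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Inside la_x86_64_gnu_pltexit, printing the result of is_aligned(__outregs, 16) would show whether the dynamic linker handed over a 16-byte aligned structure.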

Now I'm not sure who should take care of it.
Should the dynamic linker, which calls this function,
ensure the structure is aligned on 16 bytes?
Or should gcc make sure the memory is aligned
when it emits movaps? (That looks more likely to me...)

Also, is there any way to ask gcc to emit movups instead of movaps?
That would be a nice workaround :)
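
One possible workaround, sketched here under the assumption that the SSE intrinsics from <xmmintrin.h> are available, is to request the unaligned load explicitly with _mm_loadu_ps, which corresponds to movups. The helper names load_xmm_unaligned and read_four_floats are made up for illustration, and I have not verified what gcc 4.1.2 generates for this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <xmmintrin.h>

/* Hypothetical helper: load 16 bytes from src, which may have any
   alignment. _mm_loadu_ps is the unaligned (movups) load intrinsic. */
static __m128 load_xmm_unaligned(const void *src)
{
    assert(src != NULL);
    return _mm_loadu_ps((const float *)src);
}

/* Hypothetical helper: copy the four floats found at src (any
   alignment) into out, using unaligned load and store. */
static void read_four_floats(const void *src, float out[4])
{
    __m128 v = load_xmm_unaligned(src);
    _mm_storeu_ps(out, v);
}
```

In the callback above, the source address would be the lrv_xmm0 field, which the disassembly shows at offset 0x10 from __outregs. Passing it as a plain byte address, for example (const char *)__outregs + 16, avoids going through the vector type, whose declared alignment is what lets the compiler pick movaps.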

Thanks for any hints/ideas/help.

Here is my gcc configuration:
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-cpu=generic
--host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)

regards,
jirka
