Re: alignment issues for sse

On Wed, 16 Feb 2005, corey taylor wrote:

I see the definition, but I've never seen any documentation on them.
Can you point to some useful documentation?

See Intel C++ Compiler User's Guide (copy-pasted):

 Use the _mm_malloc and _mm_free intrinsics to allocate and free aligned
 blocks of memory. These intrinsics are based on malloc and free, which
 are in the libirc.a library. You need to include malloc.h. The syntax for
 these intrinsics is as follows:

 void* _mm_malloc (int size, int align)

 void _mm_free (void *p)

 The _mm_malloc routine takes an extra parameter, which is the alignment
 constraint. This constraint must be a power of two. The pointer that is
 returned from _mm_malloc is guaranteed to be aligned on the specified
 boundary.

 Note

 Memory that is allocated using _mm_malloc must be freed using _mm_free .
 Calling free on memory allocated with _mm_malloc or calling _mm_free on
 memory allocated with malloc will cause unpredictable behavior.

From gcc's version of xmmintrin.h:

/* Implemented from the specification included in the Intel C++ Compiler User Guide and Reference, version 8.0. */

So gcc implements these for icc compatibility.
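
A minimal sketch of the usage described in the excerpt above, assuming an x86 target and a compiler whose xmmintrin.h provides _mm_malloc/_mm_free (on 32-bit x86, gcc typically also needs -msse). The helpers sum_aligned and demo_mm_malloc are hypothetical names for illustration, not part of any library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>

/* Hypothetical helper: sums n floats from a 16-byte-aligned buffer.
   Assumes n is a multiple of 4. */
float sum_aligned(const float *v, size_t n)
{
    __m128 acc = _mm_setzero_ps();
    for (size_t i = 0; i < n; i += 4)
        acc = _mm_add_ps(acc, _mm_load_ps(v + i)); /* aligned load */

    float lanes[4];
    _mm_storeu_ps(lanes, acc); /* unaligned store is fine for the result */
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}

/* Hypothetical demo: allocate, check alignment, use, and release. */
float demo_mm_malloc(void)
{
    const size_t n = 8;
    float *v = (float *)_mm_malloc(n * sizeof(float), 16);
    if (v == NULL)
        return -1.0f;

    assert(((uintptr_t)v & 15) == 0); /* aligned on the requested boundary */

    for (size_t i = 0; i < n; ++i)
        v[i] = (float)(i + 1);

    float s = sum_aligned(v, n);
    _mm_free(v); /* must be _mm_free, not free */
    return s;
}
```

The pairing rule from the Note applies here: the buffer comes from _mm_malloc, so it is released with _mm_free.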

K



Currently, we develop for many platforms, so portability is better in most instances although we do use MMX and some SSE for speed where available.

corey


On Thu, 17 Feb 2005 01:51:44 +0200 (EET), Kimmo Fredriksson <kfredrik@xxxxxxxxxxxxx> wrote:
Hi,

[Disclaimer: I haven't really been following this discussion...]

On Wed, 16 Feb 2005, corey taylor wrote:

 However, after looking into the current public project I'm on, I
realize that it doesn't use SSE for the allocation.  It simply
advances to an aligned location and manually forces the alignment,
hides the actual allocation pointer, and returns the aligned pointer.

Why not use:

void * _mm_malloc (size_t size, size_t alignment)
void _mm_free (void * ptr)

?

Defined in xmmintrin.h (I think).

On Wed, 16 Feb 2005 17:58:15 +0100, Brian Budge <brian.budge@xxxxxxxxx> wrote:

On Wed, 16 Feb 2005 10:46:54 -0600, corey taylor <corey.taylor@xxxxxxxxx> wrote:
Implementations I've used and worked on always do aligned allocations
manually.  Typically the hidden and real sizes of the allocation are
put into the memory allocation itself and the returned pointer is
incremented a few bytes.  The downside to this is that you must be
strict in using the aligned free routine also.
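
The manual scheme described above could be sketched roughly as follows. This is an illustration, not the code from any of the projects mentioned: the names aligned_alloc_manual and aligned_free_manual are hypothetical, and this variant hides only the real malloc pointer (not the sizes) just below the returned block:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns a block of `size` bytes aligned to `align`,
   which must be a nonzero power of two. Returns NULL on failure. */
void *aligned_alloc_manual(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Over-allocate: payload + worst-case padding + slot for the hidden pointer. */
    size_t extra = align - 1 + sizeof(void *);
    if (size > SIZE_MAX - extra)
        return NULL;

    void *raw = malloc(size + extra);
    if (raw == NULL)
        return NULL;

    /* Skip the hidden-pointer slot, then round up to the next boundary. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);

    /* Hide the real allocation pointer just below the aligned block. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

/* Hypothetical helper: frees a block obtained from aligned_alloc_manual.
   Passing such a block to plain free() would corrupt the heap. */
void aligned_free_manual(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

As noted above, the cost of this approach is discipline: every pointer from the aligned allocator has to go back through the matching free routine.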

See above.

On Wed, 16 Feb 2005 10:09:27 -0600, Eljay Love-Jensen <eljay@xxxxxxxxx> wrote:

But surely thousands of people are writing sse code... how do they make
it work?

I presume by taking measures to assure the SSE structs are properly
aligned.

Do I need to switch to the intel compiler/linker?

I do not know.

I do not know either, but that was my solution...

But: my SSE code used to work just fine with gcc. Then something changed,
and now I just get segfaults. I don't remember the details anymore, but I
think that when it did work with gcc, I was using some early gcc 3.4
snapshot, since it was the only one that worked. No version before, no
version after (that I have tried, excluding e.g. 4.0)... And of course
there is also the possibility that something else changed, I do/did
something wrong, etc. Anyway, currently I use icc for SSE code and
_mm_malloc/_mm_free for dynamic allocation; statics are automagically
16-byte aligned.
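
Where the compiler does not align statics to 16 bytes on its own, the alignment can be requested explicitly. A small sketch, assuming a C11 compiler (with older gcc, __attribute__((aligned(16))) plays a similar role); Vec4, vec_table and is_aligned16 are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4-float vector type with an explicit 16-byte alignment request. */
typedef struct {
    _Alignas(16) float f[4];
} Vec4;

/* Static storage: every element starts on a 16-byte boundary
   because the type itself carries the alignment requirement. */
static Vec4 table[4];

/* Hypothetical accessor so other code can reach the static table. */
Vec4 *vec_table(void)
{
    return table;
}

/* Returns nonzero if p is aligned on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}
```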

For other things, I still use mostly gcc.

K



