Re: Correct way to make a 16-byte aligned double* for SSE vectorization?

On 12/31/2009 08:41 AM, Brian Budge wrote:
The reason it won't work is that you're saying the pointer variable
itself needs to be 16 (or 8) byte aligned.  What needs to be aligned
is the address stored in the pointer, i.e. the memory it points to.

On the stack:

__attribute__ ((aligned(16))) real myArray[32];

On the heap (*nix):
real *myArray;
posix_memalign((void**)&myArray, 16, 32 * sizeof(real));

or for more portability you could use the SSE intrinsic _mm_malloc.
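
For reference, the heap allocation above could be wrapped up roughly as follows. This is only a sketch: the typedef of real as double and the helper names alloc_aligned_reals and is_aligned16 are assumptions made for illustration, not anything from the original post.

```c
#define _POSIX_C_SOURCE 200112L  /* asks the C library to declare posix_memalign */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef double real;  /* assumption: `real` in this thread is double */

/* Hypothetical helper: returns n reals starting at a 16-byte aligned
   address, or NULL on failure. Release the memory with free(). */
static real *alloc_aligned_reals(size_t n)
{
    void *p = NULL;
    /* posix_memalign returns 0 on success and stores the block in p. */
    if (posix_memalign(&p, 16, n * sizeof(real)) != 0)
        return NULL;
    return (real *)p;
}

/* Hypothetical helper: nonzero if the address held in p is a multiple of 16. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}
```

Passing a plain void * to posix_memalign and converting afterwards avoids the (void**) cast on a real ** object. Note that this only settles where the memory lives; it does not by itself tell the compiler anything about pointers that are later passed to other functions.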

To know why the one version you posted works, we'd need to see the
calling code of f.   In general, it shouldn't work if malloc or new
are used to allocate the memory passed in, but it might be that the
memory is allocated on the stack?

   Brian
Hi Brian,

    I think there are two different issues:

1. First, how to actually allocate memory that is 16-byte aligned.
2. Second, how to inform the compiler that a pointer to that memory in fact has the property (p & 15L) == 0L

I am interested in the second question, whereas I think you are answering the first one.

To know why the one version you posted works, we'd need to see the
calling code of f.   In general, it shouldn't work if malloc or new
are used to allocate the memory passed in, but it might be that the
memory is allocated on the stack?
There is no calling code. I'm not saying that it works when I run it. I am saying that it works when I compile it, in the sense that the compiler makes use of the 16-byte alignment of the pointer target.
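
For later readers, one way to express the second point (telling the compiler about the alignment of the pointer target, independent of how the memory was allocated) is the __builtin_assume_aligned builtin. It exists in GCC 4.7 and later and in Clang, so it is newer than the compilers discussed in this thread. The sketch below assumes such a compiler; the function axpy_aligned is a made-up example, not the f from the original post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: y[i] += a * x[i].
   The caller promises that x and y both point to 16-byte aligned memory.
   Breaking that promise is undefined behavior. */
static void axpy_aligned(double *restrict y, const double *restrict x,
                         double a, size_t n)
{
    /* The builtin returns its argument; the compiler may assume the
       returned pointer is a multiple of 16. */
    double *ya = __builtin_assume_aligned(y, 16);
    const double *xa = __builtin_assume_aligned(x, 16);

    for (size_t i = 0; i < n; ++i)
        ya[i] += a * xa[i];
}
```

Whether the hint changes the generated code depends on the compiler version and flags. Compiling with something like gcc -O3 -msse2 -S and comparing the loop with and without the two builtin lines is one way to check whether the alignment information is actually being used.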

-BenRI
