Re: alignment issues for sse

Hi Brian,

>Could the problem be that a Camera class cannot be allocated on the heap in such a way that allows 16 byte alignment of the vector data types?

Oh yes, I believe that is very possibly the problem.

On my system, it appears that the memory allocation is fixed as if __attribute__((aligned(8))) is imposed on the allocation. (This has no bearing on padding.)

For example:
struct three { char m[3]; };
three* p = new three[4];

The addresses could be...
&p[0] == 0x10008;
&p[1] == 0x1000B;
&p[2] == 0x1000E;
&p[3] == 0x10011;

Notice that the first one is aligned on an 8-byte boundary, but not on a 16-byte boundary.
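
A quick way to see what your own heap manager promises is to test the low bits of the pointers it returns. Here is a minimal sketch; is_aligned is a name I made up for illustration, not a standard function:

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// Hypothetical helper: true if p sits on a multiple of `alignment`.
// `alignment` must be a power of two (8, 16, ...).
inline bool is_aligned(const void* p, std::size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<uintptr_t>(p) & (alignment - 1)) == 0;
}

// Same struct as in the example above: no padding, so elements are 3 bytes apart.
struct three { char m[3]; };
```

With the addresses from the example, is_aligned((void*)0x10008, 8) holds, while is_aligned((void*)0x10008, 16) does not. Calling it on the result of new three[4] (or of new Camera) on your platform should show which promise you are actually getting.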

The alignment "promise" of the heap management subsystem is platform dependent. As far as I am aware, there is no standard means to communicate alignment requirements to the heap manager. :-(

Some heap managers, such as the one with SAS/C++, have lots of knobs to programmatically tweak heap manager behavior. But that kind of API is not standard C or C++, and I'd be hesitant to rely upon it if portability is a concern (and for me, it is always a concern).

>This occurs to me now because of what you said earlier about allocating by malloc, and also because my test program ONLY included object on the stack.

Serendipitous comment!  :-)

>If this is the case, do I need to use a special memory allocator that does aligned heap allocations?

Yes. In C++, you can override the new, new[], delete and delete[] operators of your class and instrument in the desired alignment behavior.
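
A rough sketch of what that could look like, assuming only malloc/free are available. The helpers alloc16 and free16 are hypothetical names, and the Camera below is a stand-in for yours, with float[4] marking where the __m128 members would go:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdint.h>

// Hypothetical helper: over-allocate with malloc, round up to a 16-byte
// boundary, and stash the original pointer just below the returned block.
inline void* alloc16(std::size_t size)
{
  void* raw = std::malloc(size + 16 + sizeof(void*));
  if (!raw) throw std::bad_alloc();
  uintptr_t start   = reinterpret_cast<uintptr_t>(raw) + sizeof(void*);
  uintptr_t aligned = (start + 15) & ~static_cast<uintptr_t>(15);
  reinterpret_cast<void**>(aligned)[-1] = raw;
  return reinterpret_cast<void*>(aligned);
}

// Hypothetical helper: recover the pointer malloc gave us and free it.
inline void free16(void* p)
{
  if (!p) return;
  assert((reinterpret_cast<uintptr_t>(p) & 15) == 0); // must come from alloc16
  std::free(static_cast<void**>(p)[-1]);
}

// Stand-in for the Camera class; the real one would hold __m128 members.
struct Camera
{
  float position[4];
  float direction[4];

  static void* operator new(std::size_t size)   { return alloc16(size); }
  static void* operator new[](std::size_t size) { return alloc16(size); }
  static void  operator delete(void* p)         { free16(p); }
  static void  operator delete[](void* p)       { free16(p); }
};
```

After this, both new Camera and new Camera[n] come back on a 16-byte boundary. One caveat I would check on your compiler: for a class with a non-trivial destructor, new[] may put a bookkeeping cookie in front of the first element, which can shift the array off the boundary that operator new[] returned.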

Alternatively, you can create your own custom allocator object -- but I'm not familiar with the caveats / pitfalls / worries of that technique.
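
For what it is worth, here is my untested-in-anger guess at the shape of such an allocator for standard containers. It assumes a compiler and library that accept the minimal allocator interface (C++11 or later); Aligned16Allocator is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdint.h>
#include <vector>

// Hypothetical allocator: every block it hands out starts on a 16-byte boundary.
template <typename T>
struct Aligned16Allocator
{
  typedef T value_type;

  Aligned16Allocator() {}
  template <typename U> Aligned16Allocator(const Aligned16Allocator<U>&) {}

  T* allocate(std::size_t n)
  {
    // Guard against overflow in the size computation below.
    if (n > (static_cast<std::size_t>(-1) - 16 - sizeof(void*)) / sizeof(T))
      throw std::bad_alloc();
    void* raw = std::malloc(n * sizeof(T) + 16 + sizeof(void*));
    if (!raw) throw std::bad_alloc();
    // Round up past one pointer-sized slot, then remember the raw pointer there.
    uintptr_t start   = reinterpret_cast<uintptr_t>(raw) + sizeof(void*);
    uintptr_t aligned = (start + 15) & ~static_cast<uintptr_t>(15);
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<T*>(aligned);
  }

  void deallocate(T* p, std::size_t)
  {
    if (!p) return;
    assert((reinterpret_cast<uintptr_t>(p) & 15) == 0);
    std::free(reinterpret_cast<void**>(p)[-1]);
  }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const Aligned16Allocator<T>&, const Aligned16Allocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const Aligned16Allocator<T>&, const Aligned16Allocator<U>&) { return false; }
```

Usage would be along the lines of std::vector<float, Aligned16Allocator<float> > v(1024); after which &v[0] is 16-byte aligned. Only the start of the block is aligned, so each element lands on a 16-byte boundary only if sizeof(T) is a multiple of 16.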

Alternatively alternatively, you could perform the alignment yourself by kluge-magic, such as:

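#include <stdint.h>    // uintptr_t
#include <xmmintrin.h> // __m128

struct my_m128
{
  char m[32]; // 16 bytes of payload plus slack for manual alignment.
  // Skip forward to the first 16-byte-aligned address inside m.
  operator __m128& () { return *(__m128*)(&m[(16 - ((uintptr_t)&m[0] & 0x0F)) & 0x0F]); }
};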

The gotcha is the wasted space, which is only worrisome for arrays.

I think your best bet is to manage your own __m128-only mini-heap manager.
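
To make that concrete, here is a minimal sketch of one possible design: a fixed-capacity pool of 16-byte slots with a free list threaded through the unused slots. Pool16 and its member names are my own invention, and the sketch deliberately avoids the SSE headers; each slot is simply the right size and alignment to hold one __m128:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <stdint.h>

// Hypothetical fixed-capacity pool of 16-byte, 16-byte-aligned slots.
class Pool16
{
public:
  explicit Pool16(std::size_t slots)
    : raw_(std::malloc(slots * 16 + 15)), free_(0)
  {
    assert(raw_ != 0);
    // Round the slab start up to the next 16-byte boundary.
    uintptr_t base = (reinterpret_cast<uintptr_t>(raw_) + 15)
                     & ~static_cast<uintptr_t>(15);
    // Thread every slot onto the free list; a free slot stores the next pointer.
    for (std::size_t i = slots; i-- > 0; ) {
      void** slot = reinterpret_cast<void**>(base + i * 16);
      *slot = free_;
      free_ = slot;
    }
  }

  ~Pool16() { std::free(raw_); }

  // Returns an aligned slot, or a null pointer when the pool is exhausted.
  void* allocate()
  {
    if (!free_) return 0;
    void** slot = free_;
    free_ = static_cast<void**>(*slot);
    return slot;
  }

  // Puts a slot obtained from allocate() back on the free list.
  void release(void* p)
  {
    if (!p) return;
    void** slot = static_cast<void**>(p);
    *slot = free_;
    free_ = slot;
  }

private:
  Pool16(const Pool16&);            // non-copyable
  Pool16& operator=(const Pool16&);

  void*  raw_;   // what malloc returned, kept for free()
  void** free_;  // head of the free list
};
```

Allocation and release are a couple of pointer moves each, and there is no per-slot overhead, unlike the char[32] trick. The trade-offs in this sketch are the fixed capacity and the lack of thread safety; a real one would probably grow by adding slabs.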

>Are any simple libraries available?

Not to my knowledge. I do know that there are several high-performance heap replacement libraries (each one tuned for different performance characteristics), but I do not know the details of any of them. I wouldn't be surprised if one or more of them can be tuned to allocate only on 16-byte boundaries.

Side note: some heap management libraries are useful for debugging -- double deletes / double frees, overruns, underruns, scrubbing deallocated memory with a known garbage value (e.g., 0xDEADBEEF), unreleased memory at program termination (leaks), et cetera. These can be very useful tools in the developer's arsenal.

--Eljay

