Hello -

I think I've figured out that this is solely an alignment problem. I've tried everything that was suggested below, but the result is the same.

To make the problem description clearer: all __m128 variables declared on the stack appear to be 16-byte aligned (at least up to the point where my program crashes). The problem is in my Vector3 template class. Not only have I declared the internal data of the Vector3 class to be 16-byte aligned, but to make doubly sure, I have also made the Vector3 type itself 16-byte aligned.

I have a camera class which is allocated on the heap via standard new(). The camera looks like this:

class Camera{
private:
    Lens *m_lens;
    Film *m_film;
    Point3 m_eye, m_bottomLeft;
    Vector3 m_up, m_across, m_upL, m_acrossL;
    real m_angle;
    real m_1_w,m_1_h;
    ...
};

Currently the program crashes when I am calculating m_acrossL (one of the first computations) because it is aligned on an 8-byte boundary.

The Vector3 class currently looks something like:

#ifdef __SSE__
#include <xmmintrin.h>
template<> class Vector3<float>{
protected:
    union{
        float f[4];
        __m128 v;
    }m;
    ...
};

Note that I've tried pretty much every variant of adding __attribute__ ((aligned(16))) to this, but I really don't think any of that is necessary.

Could the problem be that a Camera object cannot be allocated on the heap in a way that guarantees 16-byte alignment of the vector data types? This occurs to me now because of what you said earlier about allocating by malloc, and also because my test program ONLY included objects on the stack.

If this is the case, do I need to use a special memory allocator that does aligned heap allocations? Are any simple libraries available?

Thanks,
Brian

On Tue, 15 Feb 2005 06:50:48 -0600, Eljay Love-Jensen <eljay@xxxxxxxxx> wrote:
> Hi Brian,
>
> As I understand it (in this alignment situation), the __attribute__ is
> applicable to a variable, not to a type.
>
> Also, the alignment of the variable is constrained by whatever alignment
> support is provided by the linker. For example, if you need 256 byte
> alignment for a PMMU page, many linkers may not support such a big
> alignment constraint and will only honor the request to the best of their
> ability.
>
> So for certain things, such as PMMU swap pages, the alignment burden often
> falls on the shoulders of the software engineer handling PMMU swap
> pages. (It is MOST unfortunate that the malloc routine doesn't have an
> extra optional parameter for alignment requirements. Alas and alack.)
>
> Let's hope that's not the issue you are running into.
>
> I don't know if __mode__(__V4SF__) entails the alignment request. You'll
> have to double-check the documentation.
>
> Regardless, I think you should replace your typedef with a struct.
>
> struct __m128 { int m __attribute((__mode__(__V4SF__))); };
>
> And if the __mode__(__V4SF__) doesn't entail the alignment constrained, add
> in the aligned(16) as well.
>
> And if your linker doesn't honor the aligned(16) request, you'll have to
> take some extra-ordinary steps -- let's hope you don't have to go there.
>
> HTH,
> --Eljay
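
On the heap question raised at the top of this message: the symptom (a member landing on an 8-byte boundary) is consistent with plain new/malloc guaranteeing less than 16-byte alignment, which is typically the case on 32-bit x86. One possible workaround, sketched below under stated assumptions, is to give the class its own operator new that over-allocates and rounds the address up. The names aligned_malloc, aligned_free, is_aligned, Aligned16 and Vec4 are hypothetical and do not come from the original code; Vec4 is only a stand-in for Vector3<float>, and the Camera here is a cut-down stand-in as well. Array new/delete is not covered. Where available, _mm_malloc/_mm_free or posix_memalign could replace the hand-rolled helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical helper: returns `size` bytes whose address is a multiple of
// `alignment` (a power of two, at least sizeof(void*)), or a null pointer.
inline void* aligned_malloc(std::size_t size, std::size_t alignment) {
    assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
    // Over-allocate: padding for the round-up plus one pointer-sized slot
    // that remembers where the underlying malloc block really starts.
    void* raw = std::malloc(size + alignment + sizeof(void*));
    if (!raw) return nullptr;
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t aligned =
        (start + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
    // Stash the original pointer just below the aligned address.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Releases a block obtained from aligned_malloc (null is allowed).
inline void aligned_free(void* p) {
    if (p) std::free(static_cast<void**>(p)[-1]);
}

// True if `p` sits on a multiple of `alignment`.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical mixin: single objects of derived classes created with
// `new` get 16-byte-aligned storage.
struct Aligned16 {
    static void* operator new(std::size_t size) {
        void* p = aligned_malloc(size, 16);
        if (!p) throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) { aligned_free(p); }
};

// Stand-in for Vector3<float>: four floats that must be 16-byte aligned.
struct Vec4 {
    alignas(16) float f[4];
};

// Cut-down stand-in for the Camera class in the message.
struct Camera : Aligned16 {
    void* m_lens;   // stands in for Lens*
    void* m_film;   // stands in for Film*
    Vec4 m_up, m_across, m_upL, m_acrossL;
    float m_angle;
};
```

With something along these lines, `new Camera` would go through Aligned16::operator new, so the object starts on a 16-byte boundary and the compiler's layout keeps each Vec4 member aligned relative to that start. Checking `is_aligned(&cam->m_acrossL, 16)` just before the crashing computation is a cheap way to confirm or rule out the alignment theory.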