Re: alignment issues for sse

Hello -

I think that I've figured out that this is solely an alignment
problem.  I've tried all that was suggested below, but the result is
the same.

To make the problem description clearer:

All __m128 variables declared on the stack appear to be 16 byte
aligned (at least up to the point where my program crashes).  The
problem is in my Vector3 template class.  I've not only declared the
internal data of the Vector3 class to be 16 byte aligned but, to make
doubly sure, also made the Vector3 type itself 16 byte aligned.

I have a camera class which is allocated on the heap via standard
new().  The camera looks like this:

class Camera{
private:
   Lens *m_lens;
   Film *m_film;
   Point3 m_eye, m_bottomLeft;
   Vector3 m_up, m_across, m_upL, m_acrossL;
   real m_angle;
   real m_1_w,m_1_h;
...
};

Currently the program crashes when I am calculating m_acrossL (one of
the first computations) because it is aligned on an 8 byte boundary.
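
For anyone wanting to confirm a misalignment like this before the crash, a minimal sketch of an address check follows.  The helper name is_aligned is hypothetical and not part of the program described above; it only tests the low bits of an address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns true when p lies on an `alignment`-byte
// boundary.  `alignment` must be a non-zero power of two.
inline bool is_aligned(const void* p, std::size_t alignment = 16) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}
```

Calling something like is_aligned(&m_acrossL) from inside a Camera member function would show whether the member sits on a 16 byte boundary for a given allocation.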

The Vector3 class currently looks something like:

#ifdef __SSE__
#include <xmmintrin.h>
template<>
class Vector3<float>{
protected:
    union{
        float f[4];
        __m128 v;
    }m;
...
};

Note that I've tried pretty much every variant of adding __attribute__
((aligned(16))) to this, but I really don't think any of that is
necessary.

Could the problem be that a Camera class cannot be allocated on the
heap in such a way that allows 16 byte alignment of the vector data
types?  This occurs to me now because of what you said earlier about
allocating by malloc, and also because my test program ONLY included
objects on the stack.  If this is the case, do I need to use a special
memory allocator that does aligned heap allocations?  Are any simple
libraries available?
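
One possible shape for such an allocator, as a sketch only: a class-specific operator new built on POSIX posix_memalign.  The names Aligned16, CameraLike and demo are hypothetical stand-ins, not the real Camera or Vector3 code, and the sketch assumes a POSIX system and a compiler that accepts C++11 alignas.  Array new is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <stdlib.h>   // posix_memalign, free (POSIX)

// Hypothetical mix-in: any class deriving from it gets heap storage
// that starts on a 16-byte boundary when created with plain `new`.
struct Aligned16 {
    static void* operator new(std::size_t size) {
        void* p = nullptr;
        if (posix_memalign(&p, 16, size) != 0)
            throw std::bad_alloc();
        return p;
    }
    static void operator delete(void* p) noexcept {
        free(p);  // memory from posix_memalign is released with free
    }
};

// Stand-in for a class holding SSE-style data next to ordinary members.
struct CameraLike : Aligned16 {
    void* lens;                    // stand-in for a pointer member
    alignas(16) float across[4];   // stand-in for a Vector3<float> member
};

// Usage: heap-allocate and confirm the member is 16-byte aligned.
inline void demo() {
    CameraLike* c = new CameraLike;
    assert(reinterpret_cast<std::uintptr_t>(c->across) % 16 == 0);
    delete c;
}
```

The idea is that the compiler lays out members at 16 byte offsets inside the object, so the members are aligned as long as the start of the object is; the class-level operator new takes care of that start address.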

Thanks,
  Brian



On Tue, 15 Feb 2005 06:50:48 -0600, Eljay Love-Jensen <eljay@xxxxxxxxx> wrote:
> Hi Brian,
> 
> As I understand it (in this alignment situation), the __attribute__ is
> applicable to a variable, not to a type.
> 
> Also, the alignment of the variable is constrained by whatever alignment
> support is provided by the linker.  For example, if you need 256 byte
> alignment for a PMMU page, many linkers may not support such a big
> alignment constraint and will only honor the request to the best of their
> ability.
> 
> So for certain things, such as PMMU swap pages, the alignment burden often
> falls on the shoulders of the software engineer handling PMMU swap
> pages.  (It is MOST unfortunate that the malloc routine doesn't have an
> extra optional parameter for alignment requirements.  Alas and alack.)
> 
> Let's hope that's not the issue you are running into.
> 
> I don't know if __mode__(__V4SF__) entails the alignment request.  You'll
> have to double-check the documentation.
> 
> Regardless, I think you should replace your typedef with a struct.
> 
> struct __m128 { int m __attribute((__mode__(__V4SF__))); };
> 
> And if the __mode__(__V4SF__) doesn't entail the alignment constraint, add
> in the aligned(16) as well.
> 
> And if your linker doesn't honor the aligned(16) request, you'll have to
> take some extraordinary steps -- let's hope you don't have to go there.
> 
> HTH,
> --Eljay
> 
>
