Hi there!
I'm having trouble using the vector extensions with GCC 4.2.1 on
Windows XP. I have a simple class for 3D vectors that I would like to speed up
using the extensions. The class looks like this:
#define USE_GCC_VECTORISATION (defined(__SSE__) && 1)
class Vector3D
{
public:
    /*
     * ... the usual operators
     */
private:
#if USE_GCC_VECTORISATION
    typedef float v4sf __attribute__ (( vector_size(4 * sizeof(float)) ));
    typedef int v4si __attribute__ (( vector_size(4 * sizeof(int)) ));
    union
    {
        float v[4];
        v4sf vec;
    };
    Vector3D(const v4sf& v);
#else
    float v[3];
#endif
};
Now, as soon as I activate the vectorization, I get segfaults whenever a
vector-op is involved. E.g. this function always segfaults:
Triangle::Triangle(const std::string& name, const Material& material,
                   const Vector3D& v1, const Vector3D& v2, const Vector3D& v3)
    : TraceableObject(name, material)
{
    v[0] = v1;
    v[1] = v2;
    v[2] = v3;
    normal = crossProduct(v2 - v1, v3 - v1); // segfault when computing v2 - v1
    normal.normalize();
    planeD = dotProduct(normal, v1);
}
The corresponding operator- looks like this:
inline const Vector3D operator-(const Vector3D& lhs, const Vector3D& rhs)
{
#if USE_GCC_VECTORISATION
    return Vector3D(lhs.vec - rhs.vec);
#else
    Vector3D retval(lhs);
    retval -= rhs;
    return retval;
#endif
}
More precisely, this is the disassembled code the debugger shows me:
Frame function: operator-(Vector3D const&, Vector3D const&) (src/Vector3D.cpp:138)
Frame address : 0x0022FA60
--------------------------------------------------------------------------------
0x4f07d0 push %ebp
0x4f07d1 mov %esp,%ebp
0x4f07d3 push %ebx
0x4f07d4 sub $0x1c,%esp
0x4f07d7 mov 0x8(%ebp),%ebx
0x4f07da mov %ebx,%edx
0x4f07dc mov 0xc(%ebp),%eax
0x4f07df movaps (%eax),%xmm1 ; <<<<----- the segfault happens in this instruction
0x4f07e2 mov 0x10(%ebp),%eax
0x4f07e5 movaps (%eax),%xmm0
0x4f07e8 movaps %xmm1,%xmm2
0x4f07eb subps %xmm0,%xmm2
0x4f07ee movaps %xmm2,%xmm0
0x4f07f1 movaps %xmm0,0xffffffe8(%ebp)
0x4f07f5 lea 0xffffffe8(%ebp),%eax
0x4f07f8 mov %eax,0x4(%esp)
0x4f07fc mov %edx,(%esp)
0x4f07ff call 0x46cefc <Vector3D::Vector3D(float __vector const&)>
0x4f0804 mov %ebx,%eax
0x4f0806 add $0x1c,%esp
0x4f0809 pop %ebx
0x4f080a pop %ebp
0x4f080b ret $0x4
At the moment of the segfault, EAX contains the address of lhs. Every
other piece of data also seems to hold the expected values.
Therefore I can't see any error in either my usage of the
vector extensions or the generated code. To further mystify things,
the following code works like a charm (no segfaults, no nothing,
and it yields the correct output):
#include "Vector3D.h"
int main ( int argc, char** argv )
{
    // this is quite exactly what happens in my "real" program
    Vector3D a(-4, -3, 10);
    Vector3D b(-4, 5, 10);
    Vector3D c = a - b;
    std::cout << c << std::endl;
    return 0;
}
So I fail to see what is really going on. Any hints?
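One thing I have not ruled out yet is alignment: as far as I know, movaps faults when its memory operand is not on a 16-byte boundary, so it may be worth checking the low four bits of the address in EAX. Below is a minimal sketch of such a check. The helpers isAligned16, misalignment16 and assertAligned16 are hypothetical names made up for this illustration; they are not part of my real code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if p lies on a 16-byte boundary, which is
// what movaps requires of a memory operand.
inline bool isAligned16(const void* p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 0xF) == 0;
}

// Hypothetical helper: distance in bytes from the previous 16-byte boundary.
inline unsigned misalignment16(const void* p)
{
    return static_cast<unsigned>(reinterpret_cast<std::uintptr_t>(p) & 0xF);
}

// Hypothetical debugging aid: fail with a readable assertion message
// instead of faulting inside movaps.
inline void assertAligned16(const void* p)
{
    assert(isAligned16(p) && "operand is not 16-byte aligned");
}
```

Calling assertAligned16(&lhs) and assertAligned16(&rhs) at the top of operator- should show whether the Vector3D objects in the failing program sit at different alignments than the ones in the small test program.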
Best regards,
Thomas Unterthiner