Hi Brian,
How are you making sure that your foo in...
__m128 foo = bar;
...which is allocated on the stack is 16-byte aligned?
What happens when you do this...
__m128 foo __attribute__((aligned(16))) = bar;
...?
Or do you have the __attribute__((aligned(16))) in your class?
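
One way to check this directly is to test the address of the local at run time. The following is only a minimal sketch, not your actual code: Vec4 is a hypothetical 16-byte stand-in for the SSE type (so it builds without <xmmintrin.h>), and is_aligned, aligned_local_is_16_byte_aligned, and check_demo are made-up helper names. It assumes GCC or a compiler that accepts GCC's __attribute__ syntax.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 16-byte SSE value (assumption: not the real type).
struct Vec4 {
    float v[4];
};

// Returns true when the address p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::uintptr_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Declares a stack local with the GCC alignment attribute and reports
// whether its address really is a multiple of 16.
inline bool aligned_local_is_16_byte_aligned() {
    Vec4 foo __attribute__((aligned(16))) = {{1.0f, 2.0f, 3.0f, 4.0f}};
    return is_aligned(&foo, 16);
}

// Example use: aborts (in a build without NDEBUG) if the local is misaligned.
inline void check_demo() {
    assert(aligned_local_is_16_byte_aligned());
}
```

If the check fails even with the attribute on the local, that would suggest the stack itself is not being kept 16-byte aligned at function entry, rather than a problem with the declaration.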
--Eljay