On Fri, 8 Dec 2017, Jeffrey Walton wrote:
> I have some code that loads a 64-bit integer into a XMM register. It
> loads the integer from a byte array:
>
>     byte v[8] = ...
>     __m128i t = _mm_castpd_si128(
>         _mm_load_sd((const double *)(v)));
>
> It is producing a finding for an unaligned load. I get similar findings
> for _mm_load_sd, _mm_store_sd and _mm_loaddup_pd.
>
> According to the Intel Intrinsics Guide (e.g., _mm_load_sd):
>
>     Load a double-precision (64-bit) floating-point element from memory
>     into the lower of dst, and zero the upper element. mem_addr does not
>     need to be aligned on any particular boundary.
>
> Should GCC be producing a finding in this case? Is there a way to work
> around it without an extra memcpy?
Given the way _mm_load_sd is currently implemented in gcc (it dereferences the pointer as an ordinary double), yes, sanitizers are right to complain. Intel could have named the thing _mm_loadu_sd if that's what they meant. It would be simple to change if we decide to do so; please file a PR in bugzilla.
Workaround: define a typedef for double with __attribute__((__aligned__(1))), and use _mm_set_sd(*(newtype*)p). That is how it will likely be done if we change emmintrin.h.
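A minimal sketch of that workaround, for illustration only. The names unaligned_double and load_u64_to_xmm are hypothetical, and the __may_alias__ attribute is an extra assumption that is not part of the suggestion above; it is added here because the bytes are read through a double-typed lvalue:

```c
#include <emmintrin.h>

/* Hypothetical typedef: a double that may sit at any byte address.
 * __aligned__(1) is the attribute suggested above; __may_alias__ is an
 * added assumption so that reading a byte buffer through this type
 * does not run into strict-aliasing rules. */
typedef double unaligned_double
    __attribute__((__aligned__(1), __may_alias__));

/* Hypothetical helper: load 8 bytes from p into the low half of an XMM
 * register and zero the upper half, without requiring p to be aligned. */
static inline __m128i load_u64_to_xmm(const unsigned char *p)
{
    return _mm_castpd_si128(_mm_set_sd(*(const unaligned_double *)p));
}
```

With this, the original call would become something like load_u64_to_xmm(v), and no memcpy is needed in the source. Whether the compiler emits a single movsd/movq for it depends on the optimization level.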
-- Marc Glisse