Re: Is UBSan supposed to produce a finding for _mm_load_sd and _mm_store_sd

Marc Glisse <marc.glisse@xxxxxxxx> · Fri, 8 Dec 2017 18:28:55 +0100 (CET)

On Fri, 8 Dec 2017, Jeffrey Walton wrote:

> I have some code that loads a 64-bit integer into a XMM register. It
> loads the integer from a byte array:
>
>    byte v[8] = ...
>    __m128i t = _mm_castpd_si128(
>        _mm_load_sd((const double *)(v)));
>
> It is producing a finding for an unaligned load. I get similar
> findings for _mm_load_sd, _mm_store_sd and _mm_loaddup_pd.
>
> According to the Intel Intrinsics Guide (e.g., _mm_load_sd):
>
>    Load a double-precision (64-bit) floating-point element from memory
>    into the lower of dst, and zero the upper element. mem_addr does
>    not need to be aligned on any particular boundary.
>
> Should GCC be producing a finding in this case? Is there a way to work
> around it without an extra memcpy?

The way _mm_load_sd is currently implemented in gcc, yes, sanitizers are
right to complain. Intel could have named the thing _mm_loadu_sd if that's
what they meant. It would be simple to change if we decide to do so;
please file a PR in bugzilla.

Workaround: define a typedef for double with
__attribute__((__aligned__(1))) and use _mm_set_sd(*(newtype*)p). That is
how it will likely be done if we change emmintrin.h.
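
A minimal sketch of that workaround in C, assuming GCC or a compatible
compiler targeting x86 with SSE2 enabled. The names unaligned_double and
load_u64_bytes are made up for illustration. The __may_alias__ attribute
is an addition that is not part of the suggestion above; it is included
on the assumption that the data really lives in a byte array, as in the
original question:

```c
#include <emmintrin.h>

/* Hypothetical typedef: a double that may sit at any byte address.
   __aligned__(1) tells the compiler not to assume 8-byte alignment.
   __may_alias__ is an extra assumption for reading out of a byte buffer. */
typedef double unaligned_double
    __attribute__((__aligned__(1), __may_alias__));

/* Hypothetical helper: load 8 bytes from p into the low half of an
   XMM register and zero the upper half, without calling _mm_load_sd. */
static inline __m128i load_u64_bytes(const unsigned char *p)
{
    return _mm_castpd_si128(_mm_set_sd(*(const unaligned_double *)p));
}
```

With the typedef carrying the reduced alignment, the dereference no longer
promises an aligned address, so the alignment check should have nothing to
report for it. The store direction would presumably need the same kind of
typedef on the destination pointer.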

--
Marc Glisse