On Tue, 2 Jul 2019, Jeffrey Walton wrote:
> We caught a bug report for AMD XOP's _mm_roti_epi64. We are using
> <ammintrin.h> for XOP:
>
>     #if defined(__XOP__)
>     # include <ammintrin.h>
>     #endif
>
> We use <ammintrin.h> based on Microsoft docs and:
>
>     $ sudo find / -name ammintrin.h
>     /usr/lib/gcc/x86_64-redhat-linux/8/include/ammintrin.h
>     /usr/lib64/clang/7.0.1/include/ammintrin.h
>
> However, GCC is dying on _mm_roti_epi64:
>
>     g++ -DNDEBUG -g2 -O3 -march=bdver1 -msse4.1 -c blake2b_simd.cpp
>     blake2b_simd.cpp: In function ‘void BLAKE2_Compress64_SSE4(const byte*, BLAKE2b_State&)’:
>     blake2b_simd.cpp:368:5: error: ‘_mm_roti_epi64’ was not declared in this scope
>          _mm_roti_epi64(r, c)
>          ^~~~~~~~~~~~~~
>
> -march=bdver1 is added to CXXFLAGS by the user according to
> https://bugs.gentoo.org/689162. -msse4.1 is added to the source file
> unconditionally per
> https://www.gnu.org/prep/standards/html_node/Command-Variables.html .
>
> Including <immintrin.h> does not help (per
> http://www.g-truc.net/post-0359.html):
>
>     #if defined(__XOP__)
>     # include <immintrin.h>
>     # include <ammintrin.h>
>     #endif
>
> This may be helpful:
>
>     $ lsb_release -a
>     Distributor ID: Fedora
>     Description:    Fedora release 29 (Twenty Nine)
>     Release:        29
>     Codename:       TwentyNine
>
>     $ g++ --version
>     g++ (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2)
>
> Which header should we be using for AMD XOP? We are looking for a
> standard, AMD header.

$ grep _mm_roti_epi64 *
xopintrin.h:_mm_roti_epi64(__m128i __A, const int __B)
xopintrin.h:#define _mm_roti_epi64(A, N) \
$ grep xopintrin.h *
x86intrin.h:#include <xopintrin.h>
xopintrin.h:# error "Never use <xopintrin.h> directly; include <x86intrin.h> instead."

so much work...

-- 
Marc Glisse
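
For reference, here is a minimal sketch of what the grep output suggests: with GCC (and, I believe, Clang) the XOP intrinsics are reached through <x86intrin.h>, not <ammintrin.h>. The names rotl64 and lane are made up for illustration and do not come from blake2b_simd.cpp. The non-XOP fallback is my own assumption, added so the file also builds without -mxop or -march=bdver1; it needs only SSE2.

```cpp
#include <cassert>
#include <cstdint>
#include <x86intrin.h>  // GCC/Clang umbrella header; pulls in <xopintrin.h>

// Hypothetical helper: rotate each 64-bit lane left by R bits (0 < R < 64).
template <int R>
inline __m128i rotl64(__m128i v)
{
#if defined(__XOP__)
    // XOP path: _mm_roti_epi64 is declared in <xopintrin.h>, which must
    // only be reached via <x86intrin.h>.
    return _mm_roti_epi64(v, R);
#else
    // Assumed fallback for builds without XOP: shift/or rotate with SSE2.
    return _mm_or_si128(_mm_slli_epi64(v, R), _mm_srli_epi64(v, 64 - R));
#endif
}

// Hypothetical helper: extract 64-bit lane i (0 or 1) for inspection.
inline std::uint64_t lane(__m128i v, int i)
{
    alignas(16) std::uint64_t out[2];
    _mm_store_si128(reinterpret_cast<__m128i*>(out), v);
    return out[i];
}

// Small self-check: both code paths should agree on these values.
inline void demo()
{
    const __m128i v = _mm_set_epi64x(
        static_cast<long long>(0xFF00000000000000ULL),  // lane 1
        0x0123456789ABCDEFLL);                          // lane 0
    const __m128i r = rotl64<8>(v);
    assert(lane(r, 0) == 0x23456789ABCDEF01ULL);
    assert(lane(r, 1) == 0x00000000000000FFULL);
}
```

Building the same file once with -march=bdver1 and once without should exercise both branches, though only the first actually emits the XOP rotate.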