On Tue, Jul 2, 2019 at 5:33 PM Marc Glisse <marc.glisse@xxxxxxxx> wrote:
>
> On Tue, 2 Jul 2019, Jeffrey Walton wrote:
>
> > We caught a bug report for AMD XOP's _mm_roti_epi64. We are using
> > <ammintrin.h> for XOP:
> >
> >     #if defined(__XOP__)
> >     # include <ammintrin.h>
> >     #endif
> >
> > We use <ammintrin.h> based on Microsoft docs and:
> >
> >     $ sudo find / -name ammintrin.h
> >     /usr/lib/gcc/x86_64-redhat-linux/8/include/ammintrin.h
> >     /usr/lib64/clang/7.0.1/include/ammintrin.h
> >
> > However, GCC is dying on _mm_roti_epi64:
> >
> >     g++ -DNDEBUG -g2 -O3 -march=bdver1 -msse4.1 -c blake2b_simd.cpp
> >     blake2b_simd.cpp: In function ‘void BLAKE2_Compress64_SSE4(const
> >     byte*, BLAKE2b_State&)’:
> >     blake2b_simd.cpp:368:5: error: ‘_mm_roti_epi64’ was not declared in this scope
> >          _mm_roti_epi64(r, c)
> >          ^~~~~~~~~~~~~~
> >
> > -march=bdver1 is added to CXXFLAGS by the user according to
> > https://bugs.gentoo.org/689162. -msse4.1 is added to the source file
> > unconditionally per
> > https://www.gnu.org/prep/standards/html_node/Command-Variables.html .
> >
> > Including <immintrin.h> does not help (per
> > http://www.g-truc.net/post-0359.html):
> >
> >     #if defined(__XOP__)
> >     # include <immintrin.h>
> >     # include <ammintrin.h>
> >     #endif
> >
> > This may be helpful:
> >
> >     $ lsb_release -a
> >     Distributor ID: Fedora
> >     Description:    Fedora release 29 (Twenty Nine)
> >     Release:        29
> >     Codename:       TwentyNine
> >
> >     $ g++ --version
> >     g++ (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2)
> >
> > Which header should we be using for AMD XOP? We are looking for a
> > standard, AMD header.
>
> $ grep _mm_roti_epi64 *
> xopintrin.h:_mm_roti_epi64(__m128i __A, const int __B)
> xopintrin.h:#define _mm_roti_epi64(A, N) \
>
> $ grep xopintrin.h *
> x86intrin.h:#include <xopintrin.h>
> xopintrin.h:# error "Never use <xopintrin.h> directly; include <x86intrin.h> instead."
>
> so much work...

Thanks Marc.
We don't use <x86intrin.h> because it is missing on too many systems we
support (even those that use GCC).

Is there a way to get GCC to provide the functions through a system
header as expected?

Jeff
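
One possible workaround, sketched under stated assumptions rather than offered as a GCC-sanctioned answer: include <x86intrin.h> only when the compiler reports that it exists, and otherwise build the 64-bit rotate from plain SSE2 shifts. The sketch assumes a compiler that implements __has_include (GCC 5 and later, recent Clang); older compilers simply take the fallback path. The names DEMO_HAVE_XOP, DEMO_HAVE_SSE2 and RotateLeft64x2 are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Use the XOP header only when the target enables XOP *and* the umbrella
// header is actually installed. The nested #if keeps compilers that lack
// __has_include from choking on the unknown operator.
#if defined(__XOP__) && defined(__has_include)
# if __has_include(<x86intrin.h>)
#  include <x86intrin.h>
#  define DEMO_HAVE_XOP 1   // hypothetical macro for this sketch
# endif
#endif

// Otherwise fall back to SSE2, which every x86-64 target provides.
#if !defined(DEMO_HAVE_XOP) && (defined(__SSE2__) || defined(_M_X64))
# include <emmintrin.h>
# define DEMO_HAVE_SSE2 1   // hypothetical macro for this sketch
#endif

// Rotate each of two 64-bit lanes left by the compile-time constant R.
// RotateLeft64x2 is a hypothetical helper, not part of any library.
template <int R>
inline void RotateLeft64x2(const std::uint64_t in[2], std::uint64_t out[2])
{
    static_assert(R > 0 && R < 64, "rotate count must be in 1..63");
#if defined(DEMO_HAVE_XOP)
    // XOP path: a single rotate instruction.
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_roti_epi64(v, R));
#elif defined(DEMO_HAVE_SSE2)
    // SSE2 path: (v << R) | (v >> (64 - R)) per lane.
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    const __m128i r = _mm_or_si128(_mm_slli_epi64(v, R),
                                   _mm_srli_epi64(v, 64 - R));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), r);
#else
    // Scalar path for targets without SSE2.
    out[0] = (in[0] << R) | (in[0] >> (64 - R));
    out[1] = (in[1] << R) | (in[1] >> (64 - R));
#endif
}
```

A call such as RotateLeft64x2<1>(in, out) produces the same lanes on every path, so the rest of the source does not need to know which header was found. The cost is that builds without a usable <x86intrin.h> spend two shifts and an OR per rotate instead of one XOP instruction.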