On Fri, 20 Jan 2017, Jeffrey Walton wrote:
> Hi Everyone,
>
> I have an inline function to help with a vector extract. It is a
> little more direct, and it avoids most of the conversions required
> for intrinsics:
>
>     inline uint64x2_t VEXT_8(uint64x2_t a, uint64x2_t b, unsigned int c)
>     {
>         uint64x2_t r;
>         __asm __volatile("ext %0.16b, %1.16b, %2.16b, %3 \n\t"
>             :"=w" (r) : "w" (a), "w" (b), "I" (c) );
>         return r;
>     }
>
> The compile fails under Debug builds, when no optimizations are used:
>
>     /opt/cfarm/gcc-latest/bin/g++ -g3 -O0 -march=armv8-a+crc+crypto -D_GLIBCXX_DEBUG -c gcm.cpp
>     gcm.cpp: In function 'uint64x2_t VEXT_8(uint64x2_t, uint64x2_t, unsigned int)':
>     gcm.cpp:90:48: warning: asm operand 3 probably doesn't match constraints
>              :"=w" (r) : "w" (a), "w" (b), "I" (c) );
>                                                     ^
>     gcm.cpp:90:48: error: impossible constraint in 'asm'
>
> The function is being used like:
>
>     uint64x2_t c0, c1, c2;
>     ...
>     c2 = veorq_u64(c0, VEXT_8(vdupq_n_u64(0), c1, 8));
>
> Release builds compile fine. Release builds use -g2 and -O3.
>
> I'm trying to avoid a template parameter for 'c'. It seems like it
> should work since the intrinsic works in Debug builds.
>
> My questions are: is it possible to do this without a template
> parameter? If so, what machine constraint should we use for 'c'?
>
> Thanks in advance
Does the always_inline attribute help? In any case, if you have ways to avoid inline asm...
-- Marc Glisse
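
One way to avoid the inline asm without a template parameter might be a macro around the vextq_u8 intrinsic. This is only a sketch, not something from the thread: a macro substitutes the literal offset textually, so the intrinsic sees a constant even at -O0, whereas a function parameter is only a constant after inlining and constant propagation. The helper ext8_demo is a hypothetical name used for illustration, the non-AArch64 branch is a plain-memory stand-in for the same byte extraction so the example builds on other hosts, and little-endian AArch64 is assumed.

```cpp
#include <cstdint>
#include <cstring>

#if defined(__aarch64__)
#include <arm_neon.h>

// Macro form of VEXT_8: 'c' must be a literal byte offset in 0..15.
// The uint64x2_t <-> uint8x16_t reinterprets are the conversions the
// original inline asm was trying to avoid.
#define VEXT_8(a, b, c) \
    vreinterpretq_u64_u8(vextq_u8(vreinterpretq_u8_u64(a), \
                                  vreinterpretq_u8_u64(b), (c)))

// Hypothetical helper: apply VEXT_8 with offset 8 to two 128-bit values
// held in plain arrays.
inline void ext8_demo(const uint64_t a[2], const uint64_t b[2], uint64_t out[2])
{
    vst1q_u64(out, VEXT_8(vld1q_u64(a), vld1q_u64(b), 8));
}

#else

// Stand-in for hosts without NEON: EXT with offset 8 takes bytes 8..15
// of 'a' followed by bytes 0..7 of 'b'.
inline void ext8_demo(const uint64_t a[2], const uint64_t b[2], uint64_t out[2])
{
    unsigned char cat[32];
    std::memcpy(cat, a, 16);        // low 16 bytes: first operand
    std::memcpy(cat + 16, b, 16);   // high 16 bytes: second operand
    std::memcpy(out, cat + 8, 16);  // extract 16 bytes starting at byte 8
}

#endif
```

With the macro, the call from the original post, VEXT_8(vdupq_n_u64(0), c1, 8), keeps the same spelling. The cost is the usual one for macros: no type checking on the arguments, and a non-constant 'c' still fails to compile.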