On 03/05/17 10:38, Richard Earnshaw (lists) wrote:
On 19/04/17 16:38, Jeffrey Walton wrote:
On Intel hardware, we can assemble inline assembly instructions as
long as the assembler supports it. For example, I can use inline
assembly to provide AES, CRC, CLMUL and SHA even when compiling with
just -march=x86_64.
I'm trying to do the same on ARM, but it results in an error. The
hardware is a LeMaker HiKey. It has an ARMv8/Aarch64 A-53 with CRC and
Crypto:
$ cat /proc/cpuinfo
Processor : AArch64 Processor rev 3 (aarch64)
processor : 0
...
processor : 7
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: AArch64
And:
$ g++ --save-temps test.cxx -c
test.s: Assembler messages:
test.s:24: Error: selected processor does not support `crc32b w1,w0,w0'
I'm guessing the different behaviors are unintended. My first question
is, is this expected behavior?
If its unintended, then my second question is, is this a GCC or GAS issue?
This is expected behaviour. The CRC instructions are not part of the
base instruction set (ARM-v8), so the assembler is diagnosing that
you've used instructions that are incompatible with
.cpu generic+fp+simd
If you want to fine-tune the instructions available, you'll need to
invoke GCC with either an specific cpu where the instructions exist (eg
cortex-a53) or with a -march directive that enables the additional
instructions (eg -march=armv8+crc).
R.
***********
hikey: $ cat test.s
.cpu generic+fp+simd
.file "test.cxx"
.text
.align 2
.global main
.type main, %function
main:
.LFB2948:
.cfi_startproc
sub sp, sp, #32
.cfi_def_cfa_offset 32
str w0, [sp, 12]
str x1, [sp]
ldr w0, [sp, 12]
ldr w1, [sp, 12]
uxtb w1, w1
str w0, [sp, 28]
mov w0, w1
strb w0, [sp, 27]
ldr w0, [sp, 28]
ldrb w1, [sp, 27]
#APP
// 10 "test.cxx" 1
crc32b w1, w0, w0
// 0 "" 2
#NO_APP
str w0, [sp, 20]
ldr w0, [sp, 20]
add sp, sp, 32
.cfi_def_cfa_offset 0
ret
.cfi_endproc
.LFE2948:
.size main, .-main
.ident "GCC: (Debian/Linaro 4.9.2-10) 4.9.2"
.section .note.GNU-stack,"",%progbits
***********
hikey: $ cat test.cxx
#include <arm_neon.h>
I'll also add that the __crc32b intrinsic is available through the arm_acle.h
header rather than arm_neon.h.
Kyrill
#define GCC_INLINE_ATTRIB __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
#if defined(__GNUC__) && !defined(__ARM_FEATURE_CRC32)
__inline unsigned int GCC_INLINE_ATTRIB
CRC32B(unsigned int crc, unsigned char v)
{
unsigned int r;
asm ("crc32b %w2, %w1, %w0" : "=r"(r) : "r"(crc), "r"(v));
return r;
}
#else
# define CRC32B (a,b) __crc32b(a,b)
#endif
int main(int argc, char* argv[])
{
return CRC32B(argc, argc);
}