On 21 August 2018 at 18:50, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote: > On Tue, Aug 21, 2018 at 5:23 AM Ard Biesheuvel > <ard.biesheuvel@xxxxxxxxxx> wrote: >> >> Hi Nick, >> >> On 21 August 2018 at 00:40, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote: >> > It seems that: >> > ldr q8, =0x30000000200000001 >> > >> > is a GNU as convience notation for: >> > ldr q8, .Lconstant >> > .Lconstant >> > .word 0x00000001 >> > .word 0x00000002 >> > .word 0x00000003 >> > .word 0x00000000 >> > >> > based on this comment in binutils' source [0]. I've asked for this >> > non-standard convience notation to be added to other assemblers [1], but >> > until then, we can remove it and get equivalent disassembly: >> > >> >> What do you mean by 'non-standard convenience notation'? Which asm >> 'standard' does Clang actually claim to implement? > > Well, for assembly 'standard' is a bit nebulous. But it's frustrating > when you can't find what the `=0x...` notation means in either the ARM > or GNU as manuals. The source of truth happened to be a comment in > the source [0] that explained that this was a "programmer friendly > notation." Sure, if you can find it. > Well, it is documented in the ARM manuals as the 'ldr pseudo instruction' for 32-bit ARM, and originally, it would only emit a literal if the value could not be loaded using a ordinary mov. For AArch64, I don't think that occurs any longer, i.e., a ldr is always emitted as an ldr. However, it is used widely in the ARM code (although not as much in arch/arm64) for loading compile time constants and even symbol addresses into general purpose registers. The FP variant may actually be a GNU invention, but given that there is no standard to begin with, playing the 'GNU is non-standard' card again is just a bit annoying. >> >> This 'GCC/binutils is broken and we are reluctant to subvert Clang' >> attitude is getting a bit old tbh. In the future, could you please >> mention clang explicitly in the patch subject so it is obvious what >> the purpose of the patch is? I think we should accommodate Clang in >> Linux, but the attitude has got to go. > > 'Reluctant?' Please assume good faith; I filed the bug/noticed > yesterday [1]. And I'm in favor of supporting the shorthand. That's > not my argument at all; Strawman. > OK, fair enough. But note that Clang already supports the syntax for GPRs. >> >> >> > before: >> > 00000000000009d4 <neon_aes_ctr_encrypt>: >> > ... >> > a48: 9c000ac8 ldr q8, ba0 <neon_aes_ctr_encrypt+0x1cc> >> > ... >> > ba0: 00000001 .word 0x00000001 >> > ba4: 00000002 .word 0x00000002 >> > ba8: 00000003 .word 0x00000003 >> > bac: 00000000 .word 0x00000000 >> > >> > after: >> > >> > 00000000000009d4 <neon_aes_ctr_encrypt>: >> > ... >> > a48: 9c000aa8 ldr q8, b9c <neon_aes_ctr_encrypt+0x1c8> >> > ... >> > b9c: 00000001 .word 0x00000001 >> > ba0: 00000002 .word 0x00000002 >> > ba4: 00000003 .word 0x00000003 >> > ba8: 00000000 .word 0x00000000 >> > >> > [0] https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gas/testsuite/gas/aarch64/programmer-friendly.s;h=6254c6476efdc848648b05068be0574e7addc85d;hb=HEAD#l11 >> > [1] https://bugs.llvm.org/show_bug.cgi?id=38642 >> > >> > Signed-off-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx> >> > --- >> > arch/arm64/crypto/aes-modes.S | 8 +++++++- >> > 1 file changed, 7 insertions(+), 1 deletion(-) >> > >> > diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S >> > index 483a7130cf0e..9288c5b0eca2 100644 >> > --- a/arch/arm64/crypto/aes-modes.S >> > +++ b/arch/arm64/crypto/aes-modes.S >> > @@ -232,7 +232,7 @@ AES_ENTRY(aes_ctr_encrypt) >> > bmi .Lctr1x >> > cmn w6, #4 /* 32 bit overflow? */ >> > bcs .Lctr1x >> > - ldr q8, =0x30000000200000001 /* addends 1,2,3[,0] */ >> > + ldr q8, .Laddends /* addends 1,2,3[,0] */ >> > dup v7.4s, w6 >> > mov v0.16b, v4.16b >> > add v7.4s, v7.4s, v8.4s >> > @@ -295,6 +295,12 @@ AES_ENTRY(aes_ctr_encrypt) >> > rev x7, x7 >> > ins v4.d[0], x7 >> > b .Lctrcarrydone >> > + >> > +.Laddends: >> > + .word 0x00000001 >> > + .word 0x00000002 >> > + .word 0x00000003 >> > + .word 0x00000000 >> > > >> As Will points out, this breaks BE builds. > > Looks like -EB and -EL are the flags for me to test with in GNU as > (looks like a few different flags for this). Is there a target triple > ABI for BE? I'll ask around here too, and post my findings. > >> >> > AES_ENDPROC(aes_ctr_encrypt) >> > .ltorg >> >> You can drop this .ltorg if you get rid of all =xxx instances. >> >> Actually, though, I would prefer it if we could use some clever short >> sequence of movi+add instructions, and get rid of the literal >> altogether. Or otherwise, please use something like >> >> ldr_l q8, .Laddends, <temp register> >> >> instead, and move the literal into the .rodata section (and use .octa >> so you don't break BE) > > Ah, .octa, neat. Ok, I'll take a look for v2. > Please check my solution first. I just sent it out.