Re: crypto: x86/crct10dif-pcl - cleanup and optimizations

On Mon, Jun 17, 2019 at 8:23 PM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> On Mon, Jun 17, 2019 at 11:06:21AM -0700, Nick Desaulniers wrote:
> > On Mon, Jun 17, 2019 at 6:35 AM Sedat Dilek <sedat.dilek@xxxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > while digging through a ClangBuiltLinux issue when linking with LLD
> > > linker on x86-64 I checked the settings for...
> > >
> > > .rodata.cst16 and .rodata.cst32
> > >
> > > ...in crypto tree and fell over this change in...
> > >
> > > commit "crypto: x86/crct10dif-pcl - cleanup and optimizations":
> > >
> > > -.section .rodata.cst16.SHUF_MASK, "aM", @progbits, 16
> > > +.section .rodata.cst32.byteshift_table, "aM", @progbits, 32
> > > .align 16
> > >
> > > Is that a typo?
> > > I would have expected...
> > > .rodata.cst32.XXX -> .align 32
> > > or
> > > .rodata.cst16.XXX -> .align 16
> > >
> > > But I might be wrong as I am no expert for crypto and x86/asm.
> > >
> > > Thanks in advance.
> > >
> > > Regards,
> > > - Sedat -
> > >
> > > [1] https://github.com/ClangBuiltLinux/linux/issues/431
> > > [2] https://bugs.llvm.org/show_bug.cgi?id=42289
> >
> > > [3] https://git.kernel.org/linus/0974037fc55c
> >
> > + Peter, Fangrui (who have looked at this, and started looking into
> > this from LLD's perspective)
> >
> > In fact, looking closer at that diff, the section in question
> > previously had 32b alignment.  Eric, was that change intentional?  It
> > seems funny to have a 32b entity size but a 16b alignment.
> >
> > PDF page 81 / printed page 67 of this doc:
> > https://web.eecs.umich.edu/~prabal/teaching/resources/eecs373/Assembler.pdf
> > says:
> >
> > "The linker may remove duplicates within sections with the
> > same name, same entity size and same flags. "
> >
> > So for us, LLD is NOT merging these sections due to differing
> > alignments, which is producing warnings when loading such kernel
> > modules that link against these object files.
> > --
> > Thanks,
> > ~Nick Desaulniers
>
> It was an intentional change since actually no alignment is required for this.
> It's an array of 32 bytes and the code loads 16 bytes starting from some byte
> offset in the range 1...16, so it has to use movdqu anyway.
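
If I read the comment in crct10dif-pcl-asm_64.S correctly, the table
property Eric describes can be sanity-checked with a short sketch (my
understanding of the pshufb semantics, not the kernel code; the table
values below are reconstructed from the stated property, not copied
from the file):

```python
# Sketch, not kernel code: check the byteshift_table property stated in
# the comment.  Table values reconstructed from that property:
#   table[j] = 0x80 | j   for j in 0..15
#   table[j] = j - 16     for j in 16..31
table = [0x80 | j for j in range(16)] + [j - 16 for j in range(16, 32)]

def pshufb(src, idx):
    # x86 pshufb semantics: result[i] = 0 if bit 7 of idx[i] is set,
    # else src[idx[i] & 0xf].
    return [0 if ix & 0x80 else src[ix & 0x0F] for ix in idx]

src = list(range(1, 17))  # an arbitrary 16-byte vector
for n in range(1, 16):
    # Unaligned 16-byte load at &byteshift_table[16 - n] -- this is why
    # the code uses movdqu and needs no particular section alignment.
    idx = table[16 - n : 32 - n]
    # This index vector shifts src left by n bytes...
    assert pshufb(src, idx) == [0] * n + src[:16 - n]
    # ...and XORed with {0x80, ...} it shifts right by 16 - n bytes.
    assert pshufb(src, [ix ^ 0x80 for ix in idx]) == src[16 - n:] + [0] * (16 - n)
```

So the correctness of the loads really is independent of the section
alignment; the question is only about the merge semantics.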
>
> There's no problem with changing it back to 32, but I don't fully understand
> what's going on here.  Where is it documented how alignment specifiers interact
> with the section merging?  Also, there are already other mergeable sections in
> the x86 crypto code with alignment smaller than their entity size, e.g.
>
>         .rodata.cst16.aegis128_const
>         .rodata.cst16.aegis128l_const
>         .rodata.cst16.aegis256_const
>         .rodata.cst16.morus640_const
>         .rodata.cst256.K256
>
> Do those need to be changed too?
>

Hi Eric, Hi Nick,

I am building Linux v5.1.16 with a new LLVM toolchain that includes the fix for LLD:

"[ELF] Allow placing SHF_MERGE sections with different alignments into
the same MergeSyntheticSection"

[ Alignment=16 before my patch ]

$ cd arch/x86/crypto/
$ for o in $(ls *.o) ; do echo [ $o ] ; readelf -WS $o | grep rodata\.cst32 ; done

[ crct10dif-pcl-asm_64.o ]
  [ 9] .rodata.cst32.byteshift_table PROGBITS        0000000000000000
0004e0 000020 20  AM  0   0 16

[ crct10dif-pclmul.o ]
  [ 9] .rodata.cst32.byteshift_table PROGBITS        0000000000000000
000b40 000020 20  AM  0   0 16

[ Alignment=32 after my patch ]

[ crct10dif-pcl-asm_64.o ]
  [ 9] .rodata.cst32.byteshift_table PROGBITS        0000000000000000
0004e0 000020 20  AM  0   0 32

[ crct10dif-pclmul.o ]
  [ 9] .rodata.cst32.byteshift_table PROGBITS        0000000000000000
000b40 000020 20  AM  0   0 32

I am still building the Linux kernel, but the first checks in [3] look good.

If you like, I can send out a separate patch for the issue I have reported.

I cannot say much about ...

>         .rodata.cst16.aegis128_const
>         .rodata.cst16.aegis128l_const
>         .rodata.cst16.aegis256_const
>         .rodata.cst16.morus640_const
>         .rodata.cst256.K256

... as I am not a linker or Linux x86 crypto specialist.
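
For what it's worth, those could be found mechanically.  Here is a small
sketch (my own helper, not an existing tool) that flags SHF_MERGE
sections whose alignment (Al) is smaller than their entity size (ES),
given `readelf -WS` lines like the ones above; the column layout is
assumed from readelf's output, with ES in hex and Al in decimal:

```python
# Sketch: scan `readelf -WS` output lines for SHF_MERGE sections whose
# sh_addralign (Al) is smaller than their entity size (ES).
# Assumed column layout after the "[Nr]" prefix:
#   Name Type Address Off Size ES Flg Lk Inf Al

def undersized_merge_sections(readelf_lines):
    flagged = []
    for line in readelf_lines:
        if ']' not in line:
            continue
        fields = line.split(']', 1)[1].split()
        if len(fields) < 10:
            continue
        name, es, flg, al = fields[0], int(fields[5], 16), fields[6], int(fields[9])
        if 'M' in flg and al < es:
            flagged.append((name, es, al))
    return flagged

# The crct10dif-pcl-asm_64.o section before and after my patch:
before = "  [ 9] .rodata.cst32.byteshift_table PROGBITS 0000000000000000 0004e0 000020 20  AM  0   0 16"
after  = "  [ 9] .rodata.cst32.byteshift_table PROGBITS 0000000000000000 0004e0 000020 20  AM  0   0 32"
print(undersized_merge_sections([before]))  # [('.rodata.cst32.byteshift_table', 32, 16)]
print(undersized_merge_sections([after]))   # []
```

Running this over all objects in arch/x86/crypto/ should show whether
the other sections Eric listed have the same ES/Al mismatch.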

Thanks.

Regards,
- Sedat -

[1] https://bugs.llvm.org/show_bug.cgi?id=42289#c7
[2] https://reviews.llvm.org/D63432
[3] https://github.com/ClangBuiltLinux/linux/issues/431#issuecomment-508132852
From 3e6b8ff93b100b8c140a0f693b3f2f77dd7e740b Mon Sep 17 00:00:00 2001
From: Sedat Dilek <sedat.dilek@xxxxxxxxxxx>
Date: Tue, 18 Jun 2019 09:32:38 +0200
Subject: [PATCH] crypto: x86/crct10dif-pcl: Fix alignment size in
 .rodata.cst32 section

---
 arch/x86/crypto/crct10dif-pcl-asm_64.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/crypto/crct10dif-pcl-asm_64.S b/arch/x86/crypto/crct10dif-pcl-asm_64.S
index 3d873e67749d..480fcd07e973 100644
--- a/arch/x86/crypto/crct10dif-pcl-asm_64.S
+++ b/arch/x86/crypto/crct10dif-pcl-asm_64.S
@@ -322,7 +322,7 @@ ENDPROC(crc_t10dif_pcl)
 	.octa	0x000102030405060708090A0B0C0D0E0F
 
 .section	.rodata.cst32.byteshift_table, "aM", @progbits, 32
-.align 16
+.align 32
 # For 1 <= len <= 15, the 16-byte vector beginning at &byteshift_table[16 - len]
 # is the index vector to shift left by 'len' bytes, and is also {0x80, ...,
 # 0x80} XOR the index vector to shift right by '16 - len' bytes.
-- 
2.20.1

