Re: [PATCH v3 0/3] crypto: crct10dif assembly cleanup and optimizations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 30 Jan 2019 at 04:15, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> The x86, arm, and arm64 asm implementations of crct10dif are very
> difficult to understand partly because many of the comments, labels, and
> macros are named incorrectly: the lengths mentioned are usually off by a
> factor of two from the actual code.  Many other things are unnecessarily
> convoluted as well, e.g. there are many more fold constants than
> actually needed and some aren't fully reduced.
>
> This series therefore cleans up all these implementations to be much
> more maintainable.  I also made some small optimizations where I saw
> opportunities, resulting in slightly better performance.
>
> This is based on top of the pending patches from Ard Biesheuvel.
>
> These all pass the new extra self-tests.
>

Hi Eric,

As a FYI, the issue that broke ARM and arm64 with your updated
selftests was the 1 byte length special case that you also have
special handling for in the x86 version (but while fixing that, I
noticed my version was reading beyond the end of the input). I think
it hardly matters, though, given the way T10-DIF appears to be used in
practice (disk blocks), although it is hard to be sure from reading
the code, and the algo should be correct in any case.

So what remains is the way these implementations are encapsulated by
the crct10dif() library function, which is raster nasty, making
CRC-T10DIF an excellent use case to discuss whether we can make any
improvements to address some of the concerns that were also raised in
the zinc discussion. I threw some code together a while ago [0] (and
posted it as well, IIRC). In the mean time, a 'static call'
infrastructure is being proposed that could be used in a similar way
to avoid function pointers. I'm also interested in hearing opinions on
whether the indirect call overhead is actually significant in use
cases such as this one.

-- 
Ard.



[0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=crypto-crct10dif


> Changed since v2:
> - Removed the unnecessary '__LINUX_ARM_ARCH__ < 7' case.
> - Added Ard's Acked-by.
>
> Changed since v1:
> - Moved constants in arm implementation to .rodata.
> - Eliminated a few instructions from the x86 implementation.
> - Tweaked a few comments.
>
> Eric Biggers (3):
>   crypto: x86/crct10dif-pcl - cleanup and optimizations
>   crypto: arm/crct10dif-ce - cleanup and optimizations
>   crypto: arm64/crct10dif-ce - cleanup and optimizations
>
>  arch/arm/crypto/crct10dif-ce-core.S     | 552 ++++++++--------
>  arch/arm/crypto/crct10dif-ce-glue.c     |   2 +-
>  arch/arm64/crypto/crct10dif-ce-core.S   | 496 +++++++-------
>  arch/arm64/crypto/crct10dif-ce-glue.c   |   4 +-
>  arch/x86/crypto/crct10dif-pcl-asm_64.S  | 844 +++++++++---------------
>  arch/x86/crypto/crct10dif-pclmul_glue.c |   3 +-
>  6 files changed, 794 insertions(+), 1107 deletions(-)
>
> --
> 2.20.1
>



[Index of Archives]     [Kernel]     [Gnu Classpath]     [Gnu Crypto]     [DM Crypt]     [Netfilter]     [Bugtraq]

  Powered by Linux