Looking for opinions on LEA crypto.

Hello. We previously submitted an implementation of the LEA cipher as a patch: https://lore.kernel.org/linux-crypto/20240112022859.2384-1-letrhee@xxxxxxxxx/

The patch included a generic implementation of the LEA cipher, as well as SSE2, AVX2, and AVX-512F SIMD implementations for x86-64.

We would like to hear your feedback on the patch, including any additional work needed for the patch to be accepted.

Thank you.

To supplement the explanation included in the submitted patch, here is more information on where LEA is useful and how it can be implemented.

## Appendix 1. The current position of LEA

This appendix describes where LEA is intended to be used, relative to the ciphers already in widespread use.

- If AES or another 128-bit block cipher is already available through hardware instructions or a co-processor, using it is the best option.
- If ChaCha20+Poly1305 covers the use case, it may be sufficient.
- If the use case can accept less than 128-bit (or 112-bit) security, a lighter cipher may be enough.

If none of the above applies, and

- a 128-bit block cipher and its modes of operation are required,
- 128-bit security is required,
- the machine can fully utilize ARX (addition, rotation, XOR) operations on 32-bit integers, and
- the environment requires reduced code size or memory usage,

then LEA may be an appropriate choice.

## Appendix 2. Types of LEA Implementations

LEA can be implemented in the following ways; both lightweight and fast implementations are possible.

For the basic structure of the cipher, see the Wikipedia article or the original paper.

  - https://en.wikipedia.org/wiki/LEA_(cipher)
  - Hong, Deukjo, et al. "LEA: A 128-bit block cipher for fast encryption on common processors." WISA 2013.

(1) Cipher without a stored key schedule
  - For encryption, round keys can be generated on the fly using only ARX operations, without additional storage.
  - For decryption, it is enough to compute the last round key first; the earlier round keys can then be derived backward from it.
  - If decryption is supported, it requires storage twice the size of the key, plus the eight constant 32-bit delta values.
  - The round function requires at least 6 XORs, 3 additions, and 3 rotations per round, plus 32-bit register moves.
  - Round-key generation requires 8 rotations and 2 additions per round for 128-bit keys.
  - However, because round-key generation differs by key size, this variant must be implemented separately for the chosen key size (128-bit, 192-bit, or 256-bit).
(2) Generic structure with key schedule
  - In the structure in (1), all roundkeys can be precomputed and stored in memory.
  - The same encryption and decryption code can then be used regardless of key size.
(3) Generic structure with partial unrolling in 4 rounds
  - In the structure in (2), partial unrolling can be performed in 4-round increments to minimize register moves.
(4) Generic structure with additional storage of decryption roundkeys (submitted version)
  - In the structure in (3), we can store the decryption roundkey separately to optimize the decryption process.
  - This is the version we submitted as `lea_generic.c`.
(5) Structure with precomputed key-schedule constants
  - In the structure in (2), the delta values required for round-key generation can be precomputed.
(6) SIMD implementation version (submitted)
  - The structure in (3) or (4) can be implemented with SIMD.
  - We use registers holding 4, 8, or 16 32-bit lanes (32x4, 32x8, and 32x16) and the corresponding SIMD ARX instructions.
  - This is the version we submitted as `lea_x86_64.c`.
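
To make variant (1) more concrete, here is a minimal C sketch of the LEA-128 round function, its inverse, and on-the-fly round-key generation in both directions, following the structure described in the Wikipedia article linked above. It is not taken from `lea_generic.c`. All function names are hypothetical, and the `delta` values in the self-test are placeholders rather than the real LEA constants (those must be taken from the specification), so the sketch only demonstrates the structure and that decryption can start from the last round key; it does not produce real LEA ciphertext.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LEA128_ROUNDS 24

/* 32-bit rotates; the count is reduced mod 32 so that 0 and 32 are safe. */
static uint32_t rol32(uint32_t x, unsigned int n)
{
	n &= 31;
	return (x << n) | (x >> ((32 - n) & 31));
}

static uint32_t ror32(uint32_t x, unsigned int n)
{
	return rol32(x, 32 - (n & 31));
}

/* One encryption round: 6 XORs, 3 additions, 3 rotations, 1 word move. */
static void lea_round_enc(uint32_t x[4], const uint32_t rk[6])
{
	uint32_t x0 = x[0];

	x[0] = rol32((x[0] ^ rk[0]) + (x[1] ^ rk[1]), 9);
	x[1] = ror32((x[1] ^ rk[2]) + (x[2] ^ rk[3]), 5);
	x[2] = ror32((x[2] ^ rk[4]) + (x[3] ^ rk[5]), 3);
	x[3] = x0;
}

/* Inverse of lea_round_enc() for the same round key. */
static void lea_round_dec(uint32_t x[4], const uint32_t rk[6])
{
	uint32_t y0 = x[0], y1 = x[1], y2 = x[2];

	x[0] = x[3];
	x[1] = (ror32(y0, 9) - (x[0] ^ rk[0])) ^ rk[1];
	x[2] = (rol32(y1, 5) - (x[1] ^ rk[2])) ^ rk[3];
	x[3] = (rol32(y2, 3) - (x[2] ^ rk[4])) ^ rk[5];
}

/* Advance the 128-bit key state t to round i (forward direction). */
static void lea128_rk_next(uint32_t t[4], uint32_t delta, unsigned int i)
{
	t[0] = rol32(t[0] + rol32(delta, i), 1);
	t[1] = rol32(t[1] + rol32(delta, i + 1), 3);
	t[2] = rol32(t[2] + rol32(delta, i + 2), 6);
	t[3] = rol32(t[3] + rol32(delta, i + 3), 11);
}

/* Undo lea128_rk_next() for round i (backward direction, for decryption). */
static void lea128_rk_prev(uint32_t t[4], uint32_t delta, unsigned int i)
{
	t[0] = ror32(t[0], 1) - rol32(delta, i);
	t[1] = ror32(t[1], 3) - rol32(delta, i + 1);
	t[2] = ror32(t[2], 6) - rol32(delta, i + 2);
	t[3] = ror32(t[3], 11) - rol32(delta, i + 3);
}

/* The six round-key words are drawn from the four-word key state. */
static void lea128_rk_expand(const uint32_t t[4], uint32_t rk[6])
{
	rk[0] = t[0]; rk[1] = t[1]; rk[2] = t[2];
	rk[3] = t[1]; rk[4] = t[3]; rk[5] = t[1];
}

/* Round-trip check; returns 1 if decryption restores plaintext and key. */
int lea128_sketch_selftest(void)
{
	/* PLACEHOLDER values, NOT the real LEA delta constants. */
	static const uint32_t delta[4] = {
		0x01234567, 0x89abcdef, 0x0f1e2d3c, 0x4b5a6978
	};
	const uint32_t key[4] = { 0x3c2d1e0f, 0x78695a4b, 0xb4a59687, 0xf0e1d2c3 };
	const uint32_t pt[4] = { 0x13121110, 0x17161514, 0x1b1a1918, 0x1f1e1d1c };
	uint32_t t[4], x[4], rk[6];
	int i;

	memcpy(t, key, sizeof(t));
	memcpy(x, pt, sizeof(x));

	/* Encrypt, generating each round key just before it is used. */
	for (i = 0; i < LEA128_ROUNDS; i++) {
		lea128_rk_next(t, delta[i % 4], i);
		lea128_rk_expand(t, rk);
		lea_round_enc(x, rk);
	}

	/* t now holds the last round-key state; decrypt by walking backward. */
	for (i = LEA128_ROUNDS - 1; i >= 0; i--) {
		lea128_rk_expand(t, rk);
		lea_round_dec(x, rk);
		lea128_rk_prev(t, delta[i % 4], i);
	}

	assert(memcmp(x, pt, sizeof(x)) == 0);
	assert(memcmp(t, key, sizeof(t)) == 0);
	return 1;
}
```

In this sketch the only state kept besides the block is the four-word key state `t`. A decrypt-only caller would store or recompute the final `t` in addition to the original key, which corresponds to the "twice the size of the key" storage noted in (1). Variant (2) would instead run the forward loop once at key-setup time and store every `rk`.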



