On Tue, Nov 28, 2023 at 12:22:26PM +0800, Jerry Shih wrote:
> On Nov 28, 2023, at 11:56, Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> > On Mon, Nov 27, 2023 at 03:06:54PM +0800, Jerry Shih wrote:
> >> +int riscv64_aes_setkey(struct crypto_aes_ctx *ctx, const u8 *key,
> >> +		       unsigned int keylen)
> >> +{
> >> +	int ret;
> >> +
> >> +	ret = aes_check_keylen(keylen);
> >> +	if (ret < 0)
> >> +		return -EINVAL;
> >> +
> >> +	/*
> >> +	 * The RISC-V AES vector crypto key expanding doesn't support AES-192.
> >> +	 * Use the generic software key expanding for that case.
> >> +	 */
> >> +	if ((keylen == 16 || keylen == 32) && crypto_simd_usable()) {
> >> +		/*
> >> +		 * All zvkned-based functions use encryption expanding keys for both
> >> +		 * encryption and decryption.
> >> +		 */
> >> +		kernel_vector_begin();
> >> +		rv64i_zvkned_set_encrypt_key(key, keylen, ctx);
> >> +		kernel_vector_end();
> >> +	} else {
> >> +		ret = aes_expandkey(ctx, key, keylen);
> >> +	}
> >
> > rv64i_zvkned_set_encrypt_key() does not initialize crypto_aes_ctx::key_dec.
> > So, decryption results will be incorrect if !crypto_simd_usable() later.
>
> Will we have the situation that `crypto_simd_usable()` condition is not consistent
> during the aes_setkey(), aes_enc/dec()?  If yes, all accelerated(or HW specific)
> crypto algorithms should do the same implementations as the sw fallback path
> since the `crypto_simd_usable()` will change back and forth.

Yes, the calls to one "crypto_cipher" can happen in different contexts.  For
example, crypto_simd_usable() can be true during setkey and false during
decrypt, or vice versa.

If the RISC-V decryption code wants to use the regular key schedule (key_enc)
instead of the "Equivalent Inverse Cipher key schedule" (key_dec), that's
perfectly fine, but setkey still needs to initialize key_dec in case the
fallback to aes_decrypt() gets taken.

> >> diff --git a/arch/riscv/crypto/aes-riscv64-zvkned.pl b/arch/riscv/crypto/aes-riscv64-zvkned.pl
> >> new file mode 100644
> >> index 000000000000..303e82d9f6f0
> >> --- /dev/null
> >> +++ b/arch/riscv/crypto/aes-riscv64-zvkned.pl
> > [...]
> >> +L_enc_128:
> > [...]
> >> +L_enc_192:
> > [...]
> >> +L_enc_256:
> >
> > There's some severe source code duplication going on in the AES assembly, with
> > the three AES variants having separate source code.  You can just leave this
> > as-is since this is what was merged into OpenSSL and we are borrowing that for
> > now, but I do expect that we'll want to clean this up later.
>
> Do we prefer the code with the branches instead of the specified implementation?
> We could make AES-128/192/256 together like:
>
> @{[vaesz_vs $V24, $V1]}
> @{[vaesem_vs $V24, $V2]}
> @{[vaesem_vs $V24, $V3]}
> @{[vaesem_vs $V24, $V4]}
> @{[vaesem_vs $V24, $V5]}
> @{[vaesem_vs $V24, $V6]}
> @{[vaesem_vs $V24, $V7]}
> @{[vaesem_vs $V24, $V8]}
> @{[vaesem_vs $V24, $V9]}
> @{[vaesem_vs $V24, $V10]}
> beq $ROUND, $ROUND_11, 1f
> @{[vaesem_vs $V24, $V11]}
> @{[vaesem_vs $V24, $V12]}
> beq $ROUND, $ROUND_13, 1f
> @{[vaesem_vs $V24, $V13]}
> @{[vaesem_vs $V24, $V14]}
> 1:
> @{[vaesef_vs $V24, $V15]}
>
> But we will have the additional costs for the branches.

That needs to be decided on a case by case basis depending on the performance
impact and how much binary code is saved.  On some architectures, separate
binary code for AES-{128,192,256} has been found to be worthwhile.  However,
that does *not* mean that they need to have separate source code.
Take a look at how arch/x86/crypto/aes_ctrby8_avx-x86_64.S generates code for
all the AES variants using macros, for example.

Anyway, I don't think you should bother making too many changes to the
"perlasm" files.  If we decide to make major cleanups I think we should just
replace them with .S files (which already support macros).

- Eric
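
[Editor's sketch, key_dec point] Purely as an untested illustration of the
key_dec requirement discussed above, not the actual patch: the simplest way to
keep the aes_encrypt()/aes_decrypt() fallbacks correct is to let the generic
key expansion initialize both key schedules unconditionally, for example:

#include <crypto/aes.h>

int riscv64_aes_setkey(struct crypto_aes_ctx *ctx, const u8 *key,
		       unsigned int keylen)
{
	/*
	 * aes_expandkey() validates keylen and fills in ctx->key_enc,
	 * ctx->key_dec, and ctx->key_length, so both the vector code
	 * (which only needs key_enc) and the generic aes_encrypt()/
	 * aes_decrypt() fallbacks work regardless of whether
	 * crypto_simd_usable() is true or false later on.
	 */
	return aes_expandkey(ctx, key, keylen);
}

If the zvkned key expansion is kept for the AES-128/256 cases, it would need
to be combined with something that still leaves ctx->key_dec initialized,
since the fallback path may be taken for the same tfm later.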
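
[Editor's sketch, source-duplication point] A rough, untested sketch of the
".S file with macros" direction mentioned above. The macro name, register
assignment (round keys preloaded into v1.., data in v24, mirroring the quoted
perlasm), and the assumption that the assembler accepts the Zvkned mnemonics
directly are illustrative assumptions, not taken from the patch:

/* One shared macro expands to the AES-128/192/256 encryption rounds. */
.macro	aes_encrypt_rounds	keylen
	vaesz.vs	v24, v1		/* round 0: AddRoundKey */
	vaesem.vs	v24, v2
	vaesem.vs	v24, v3
	vaesem.vs	v24, v4
	vaesem.vs	v24, v5
	vaesem.vs	v24, v6
	vaesem.vs	v24, v7
	vaesem.vs	v24, v8
	vaesem.vs	v24, v9
	vaesem.vs	v24, v10
.if \keylen == 16
	vaesef.vs	v24, v11	/* final round, AES-128 */
.else
	vaesem.vs	v24, v11
	vaesem.vs	v24, v12
.if \keylen == 24
	vaesef.vs	v24, v13	/* final round, AES-192 */
.else
	vaesem.vs	v24, v13
	vaesem.vs	v24, v14
	vaesef.vs	v24, v15	/* final round, AES-256 */
.endif
.endif
.endm

	/* e.g. inside the AES-256 encryption routine: */
	aes_encrypt_rounds	32

Each key size still gets its own straight-line binary code (no branches at
run time), but the source exists only once.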