On Mon, Jan 20, 2025 at 08:18:14PM +0000, David Howells wrote:
> Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
>
> > In any case, why would you need anything to do asynchronous at all here?
>
> Because authenc, which I copied, passes the asynchronocity mode onto the two
> algos it runs (one encrypt, one hash).  If authenc is run synchronously, then
> the algos are run synchronously and serially; but if authenc is run async,
> then the algos are run asynchronously - but they may still have to be run
> serially[*] and the second is dispatched from the completion handler of the
> first.  So two different paths through the code exist, and rxgk and testmgr
> only test the synchronous path.

No, it goes in the other direction.  The underlying algorithms decide whether
they are asynchronous or not, and that gets passed up.

It sounds like what you want to do is test your template in the case where the
underlying algorithms are asynchronous.  There is a way to do that by wrapping
the underlying algorithms with cryptd.  For example the following works with
gcm:

python3 <<EOF
import socket
s = socket.socket(socket.AF_ALG, 5, 0)
s.bind(("aead", "gcm_base(cryptd(ctr(aes-generic)),cryptd(ghash-generic))"))
EOF

This really should just be thought of as complying with the outdated design of
the crypto API, though.  In practice synchronous is the only case that really
matters.

> [*] Because in authenc-compatible encoding types, the output of the encryption
> is hashed.  Older krb5 encodings hash the plaintext and the hash generation
> and the encrypt can be run in parallel.  For decrypting, the reverse is true;
> authenc may be able to do the decrypt and the hash in parallel...  But
> parallellisation also requires that the input and output buffers are not the
> same.

The right way to optimize cases like that is to interleave the two
computations.  Look at how the AES-GCM assembly code interleaves AES-CTR and
GHASH for example.  Doing something with async threads is the completely wrong
solution here and would be much slower.  The amount of time needed to process a
single message is simply far too short for multithreading to be appropriate on
a per message basis.

- Eric
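
For anyone who wants to repeat the cryptd check from a script, the snippet in the message can be wrapped roughly as follows. This is only a sketch under a few assumptions: try_instantiate_aead is a hypothetical helper name (not a kernel or Python API), socket.SOCK_SEQPACKET is used in place of the literal 5, and a failed bind is reported rather than raised, since AF_ALG or cryptd may simply be unavailable on the machine running it.

```python
import socket

# Algorithm name taken from the message; cryptd() wraps each underlying
# algorithm so that it is executed asynchronously.
ASYNC_GCM = "gcm_base(cryptd(ctr(aes-generic)),cryptd(ghash-generic))"


def try_instantiate_aead(name):
    """Try to instantiate an AEAD through AF_ALG; return (ok, detail).

    Hypothetical helper for illustration only.
    """
    if not hasattr(socket, "AF_ALG"):
        return False, "AF_ALG is not available on this platform"
    try:
        # SOCK_SEQPACKET is the named constant for the literal 5 on Linux.
        with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as s:
            # Binding asks the kernel to look up or build the algorithm.
            s.bind(("aead", name))
    except OSError as exc:
        # Covers a missing algorithm, missing AF_ALG support, or
        # insufficient permissions.
        return False, f"bind failed: {exc}"
    return True, "instantiated"


if __name__ == "__main__":
    ok, detail = try_instantiate_aead(ASYNC_GCM)
    print(f"{ASYNC_GCM}: {detail}")
```

A successful bind only shows that the kernel could build the cryptd-wrapped instance; exercising the asynchronous path of a template still requires sending requests through it.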