On Tue, 2023-02-14 at 15:36 +0100, Ard Biesheuvel wrote:
> On Tue, 14 Feb 2023 at 15:28, James Bottomley
> <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> > On Tue, 2023-02-14 at 14:54 +0100, Ard Biesheuvel wrote:
[...]
> > >
> > > Can we avoid shashes and sync skciphers at all? We have sha256
> > > and AES library routines these days, and AES in CFB mode seems
> > > like a good candidate for a library implementation as well - it
> > > uses AES encryption only, and is quite straight forward to
> > > implement. [0]
> >
> > Yes, sure. I originally suggested something like this way back
> > four years ago, but it got overruled on the grounds that if I
> > didn't use shashes and skciphers some architectures would be unable
> > to use crypto acceleration. If that's no longer a consideration,
> > I'm all for simplification of static cipher types.
> >

I now have this all implemented, and I looked over your code, so you
can add my tested/reviewed-by to the aescfb implementation.

On the acceleration issue, I'm happy to ignore external accelerators
because they're a huge pain for small fragments of encryption like the
TPM, but it would be nice if we could integrate CPU instruction
acceleration (like AES-NI on x86) into the library functions.

I also got a test rig to investigate arc. It seems there is a huge
problem with the SKCIPHER stack structure on that platform. For
reasons I still can't fathom, the compiler thinks it needs at least
0.5k of stack for this one structure. I'm sure it's something to do
with an incorrect crypto alignment on arc, but I can't yet find the
root cause.

> I don't know if that is a consideration or not. The AES library code
> is generic C code that was written to be constant-time, rather than
> fast. The fact that CFB only uses the encryption side of it is
> fortunate, because decryption is even slower.

I think for the TPM, since the encryption isn't exactly bulk (it's
really under 1k for command and response encryption) it doesn't matter
... in fact setting up the accelerator is likely a bigger overhead.

> So the question is whether this will actually be a bottleneck in this
> particular scenario. The synchronous accelerated AES implementations
> are all SIMD based, which means there is some overhead, and some
> degree of parallelism is also needed to take full advantage, and CFB
> only allows this for decryption to begin with, as encryption uses
> ciphertext block N-1 as AES input for encrypting block N.
>
> So maybe this is terrible advice, but the code will look so much
> better for it, and we can always add back the performance later if it
> is really an impediment.

It's definitely smaller and neater, yes. I'll post a v3 based on this,
but when might it go upstream? In my post I'll put your aescfb as
patch 1 so the static checkers don't go haywire about missing function
exports, and we can drop that patch when it is upstream.

James

> > >
> > > The crypto API is far too clunky for synchronous operations of
> > > algorithms that are known at compile time, and the requirement to
> > > use scatterlists for skciphers is especially horrid.
> > >
> > > [0]
> > > https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=crypto-aes-cfb-library
> >
> > OK, let me have a go at respinning based on this.
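
For anyone following along, the CFB chaining discussed above (ciphertext
block N-1 is the cipher input for block N, and only the encrypt direction
of the block cipher is ever called) can be sketched in plain userspace C
roughly as below. This is an illustration only, not the aescfb code from
[0]: toy_encrypt_block() is a made-up, insecure stand-in for an AES
encrypt call, and the names cfb_encrypt(), cfb_decrypt() and
cfb_roundtrip_ok() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16 /* block size in bytes, matching AES */

/*
 * Hypothetical stand-in for the block cipher's encrypt direction.
 * This is NOT AES and is not secure; it only gives CFB something to call.
 */
static void toy_encrypt_block(const uint8_t key[BLK], const uint8_t in[BLK],
                              uint8_t out[BLK])
{
    for (int i = 0; i < BLK; i++)
        out[i] = (uint8_t)((in[i] ^ key[i]) * 167u + 13u + (unsigned)i);
}

/* CFB encryption: each keystream block depends on the previous ciphertext. */
static void cfb_encrypt(const uint8_t key[BLK], const uint8_t iv[BLK],
                        const uint8_t *src, uint8_t *dst, size_t len)
{
    uint8_t fb[BLK], ks[BLK];

    assert(len == 0 || (src && dst));
    memcpy(fb, iv, BLK);
    while (len > 0) {
        size_t n = len < BLK ? len : BLK;

        toy_encrypt_block(key, fb, ks);   /* encrypt the feedback block */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i] ^ ks[i];
        memcpy(fb, dst, n);               /* ciphertext feeds the next block */
        src += n; dst += n; len -= n;
    }
}

/* CFB decryption: still uses the encrypt direction of the block cipher. */
static void cfb_decrypt(const uint8_t key[BLK], const uint8_t iv[BLK],
                        const uint8_t *src, uint8_t *dst, size_t len)
{
    uint8_t fb[BLK], ks[BLK];

    assert(len == 0 || (src && dst));
    memcpy(fb, iv, BLK);
    while (len > 0) {
        size_t n = len < BLK ? len : BLK;

        toy_encrypt_block(key, fb, ks);
        memcpy(fb, src, n);               /* save ciphertext first (in-place safe) */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i] ^ ks[i];
        src += n; dst += n; len -= n;
    }
}

/* Returns 1 if a short message survives an encrypt/decrypt round trip. */
int cfb_roundtrip_ok(void)
{
    const uint8_t key[BLK] = { 1, 2, 3, 4, 5, 6, 7, 8,
                               9, 10, 11, 12, 13, 14, 15, 16 };
    const uint8_t iv[BLK] = { 0 };
    const uint8_t msg[] = "short fragment, not a multiple of 16";
    uint8_t ct[sizeof(msg)], pt[sizeof(msg)];

    cfb_encrypt(key, iv, msg, ct, sizeof(msg));
    cfb_decrypt(key, iv, ct, pt, sizeof(msg));
    return memcmp(pt, msg, sizeof(msg)) == 0 &&
           memcmp(ct, msg, sizeof(msg)) != 0;
}
```

In the decrypt loop every ciphertext block is available up front, so the
block cipher calls could in principle run in parallel; in the encrypt
loop each call has to wait for the previous block's output, which is the
serialization noted in the quoted text.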