On Fri, 20 Dec 2019 at 20:02, Eneas U de Queiroz <cotequeiroz@xxxxxxxxx> wrote:
>
> I've been trying to make the Qualcomm Crypto Engine work with GCM-mode
> AES. I fixed some bugs, and added an option to build only hashes or
> skciphers, as the VPN performance increases if you leave some of that to
> the CPU.
>
> A discussion about this can be found here:
> https://github.com/openwrt/openwrt/pull/2518
>
> I'm using openwrt to test this, and there's no support for kernel 5.x
> yet. So I have backported the recent skcipher updates, and tested this
> with 4.19. I don't have the hardware with me, but I have run-tested
> everything, working remotely.
>
> All of the skciphers directly implemented by the driver work. They pass
> the tcrypt tests, and also some tests from userspace using AF_ALG:
> https://github.com/cotequeiroz/afalg_tests
>
> However, I can't get gcm(aes) to work. When setting the gcm-mode key,
> it sets the ctr(aes) key, then encrypts a block of zeroes, and uses that
> as the ghash key. The driver fails to perform that encryption. I've
> dumped the input and output data, and they apparently are not touched by
> the QCE. The IV, which is written to a buffer appended to the results sg
> list, gets updated, but the results themselves are not. I'm not sure
> what goes wrong, if it is a DMA/cache problem, memory alignment, or
> whatever.
>

This does sound like a DMA problem. I assume the accelerator is not
cache coherent?

In any case, it is dubious whether the round trip to the accelerator
is worth it when encrypting the GHASH key. Just call aes_encrypt()
instead, and do it in software.

> If I take 'be128 hash' out of the 'data' struct, and kzalloc them
> separately in crypto_gcm_setkey (crypto/gcm.c), it encrypts the data
> just fine--perhaps the payload and the request struct can't be in the
> same page?
>

Non-cache coherent DMA involves cache invalidation on inbound data.
So if both the device and the CPU write to the same cacheline while the
buffer is mapped for DMA from device to memory, one of the updates
gets lost.

> However, it still fails during decryption of the very first tcrypt test
> vector (I'm testing with the AF_ALG program, using the same vectors as
> the kernel), in the final encryption to compute the authentication tag,
> in the same fashion as it did in 'crypto_gcm_setkey'. What this case
> has in common with the ghash key above is the input data, a single block
> of zeroes, so this may be a hardware bug. However, if I perform the
> same encryption using the AF_ALG test, it completes OK.
>
> I am not experienced enough with the Linux Kernel, or with the ARM
> architecture to wrap this up on my own, so I need some pointers to what
> to try next.
>
> To come up with a working setup, I am passing any AES requests whose
> length is less than or equal to one AES block to the fallback skcipher.
> This hack is not a part of this series, but I can send it if there's any
> interest in it.
>
> Anyway, the patches in this series are complete enough on their own.
> With the exception of the last patch, they're all bugfixes.
>
> Cheers,
>
> Eneas
>
> Eneas U de Queiroz (6):
>   crypto: qce - fix ctr-aes-qce block, chunk sizes
>   crypto: qce - fix xts-aes-qce key sizes
>   crypto: qce - save a sg table slot for result buf
>   crypto: qce - update the skcipher IV
>   crypto: qce - initialize fallback only for AES
>   crypto: qce - allow building only hashes/ciphers
>
>  drivers/crypto/Kconfig        |  63 ++++++++-
>  drivers/crypto/qce/Makefile   |   7 +-
>  drivers/crypto/qce/common.c   | 244 ++++++++++++++++++----------------
>  drivers/crypto/qce/core.c     |   4 +
>  drivers/crypto/qce/dma.c      |   6 +-
>  drivers/crypto/qce/dma.h      |   3 +-
>  drivers/crypto/qce/skcipher.c |  41 ++++--
>  7 files changed, 229 insertions(+), 139 deletions(-)
>
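To make the cacheline point above concrete, here is a rough userspace
sketch (not kernel code, and not the actual layout used in crypto/gcm.c).
It only checks whether a DMA target and a CPU-written field land in the
same cacheline. The 64-byte line size, both struct layouts, and all the
names below are assumptions made up for illustration.

```c
#include <stdalign.h> /* alignas */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

/* Assumption: 64-byte cachelines; the real size depends on the SoC. */
#define CACHELINE 64

/* Hypothetical layout: DMA target embedded next to CPU-written state. */
struct shared_layout {
	uint8_t hash[16];	/* device writes here (DMA from device) */
	int cpu_state;		/* CPU writes here while the DMA is in flight */
};

/* Same fields, but each one starts on its own cacheline. */
struct padded_layout {
	alignas(CACHELINE) uint8_t hash[16];
	alignas(CACHELINE) int cpu_state;
};

/*
 * Return 1 if byte ranges [a_off, a_off + a_len) and [b_off, b_off + b_len)
 * touch at least one common cacheline. Offsets are relative to a base that
 * is assumed to be cacheline aligned; both lengths must be non-zero.
 */
int shares_cacheline(size_t a_off, size_t a_len, size_t b_off, size_t b_len)
{
	size_t a_first = a_off / CACHELINE;
	size_t a_last = (a_off + a_len - 1) / CACHELINE;
	size_t b_first = b_off / CACHELINE;
	size_t b_last = (b_off + b_len - 1) / CACHELINE;

	return a_first <= b_last && b_first <= a_last;
}

/* Does the DMA target share a line with cpu_state in the packed layout? */
int shared_layout_conflicts(void)
{
	return shares_cacheline(offsetof(struct shared_layout, hash), 16,
				offsetof(struct shared_layout, cpu_state),
				sizeof(int));
}

/* Same question for the padded layout. */
int padded_layout_conflicts(void)
{
	return shares_cacheline(offsetof(struct padded_layout, hash), 16,
				offsetof(struct padded_layout, cpu_state),
				sizeof(int));
}
```

With these assumed layouts, shared_layout_conflicts() reports a conflict
and padded_layout_conflicts() does not. In the kernel, the usual ways to
get the second situation are a separate allocation for the DMA buffer
(which is effectively what the kzalloc experiment did) or aligning the
field with ____cacheline_aligned / ARCH_DMA_MINALIGN.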