On Thu, 2 Jan 2020 at 22:09, Eneas Queiroz <cotequeiroz@xxxxxxxxx> wrote:
>
> I'm changing the subject title, as the original series has been merged.
>
> On Mon, Dec 23, 2019 at 6:46 AM Ard Biesheuvel <ard.biesheuvel@xxxxxxxxxx> wrote:
>>
>> On Fri, 20 Dec 2019 at 20:02, Eneas U de Queiroz <cotequeiroz@xxxxxxxxx> wrote:
>> >
>> > I've been trying to make the Qualcomm Crypto Engine work with GCM-mode
>> > AES. I fixed some bugs, and added an option to build only hashes or
>> > skciphers, as the VPN performance increases if you leave some of that to
>> > the CPU.
>> >
>> > A discussion about this can be found here:
>> > https://github.com/openwrt/openwrt/pull/2518
>> >
>> > I'm using openwrt to test this, and there's no support for kernel 5.x
>> > yet. So I have backported the recent skcipher updates, and tested this
>> > with 4.19. I don't have the hardware with me, but I have run-tested
>> > everything, working remotely.
>> >
>> > All of the skciphers directly implemented by the driver work. They pass
>> > the tcrypt tests, and also some tests from userspace using AF_ALG:
>> > https://github.com/cotequeiroz/afalg_tests
>> >
>> > However, I can't get gcm(aes) to work. When setting the gcm-mode key,
>> > it sets the ctr(aes) key, then encrypts a block of zeroes, and uses that
>> > as the ghash key. The driver fails to perform that encryption. I've
>> > dumped the input and output data, and they apparently are not touched by
>> > the QCE. The IV, which is written to a buffer appended to the results sg
>> > list gets updated, but the results themselves are not. I'm not sure
>> > what goes wrong, if it is a DMA/cache problem, memory alignment, or
>> > whatever.
>> >
>>
>> This does sound like a DMA problem. I assume the accelerator is not
>> cache coherent?
>>
>> In any case, it is dubious whether the round trip to the accelerator
>> is worth it when encrypting the GHASH key. Just call aes_encrypt()
>> instead, and do it in software.
>
>
> ipsec still fails, even if I use software for every single-block
> operation. I can perhaps leave that as an optimization, but it won't
> fix the main issue.
>
>> > If I take 'be128 hash' out of the 'data' struct, and kzalloc them
>> > separately in crypto_gcm_setkey (crypto/gcm.c), it encrypts the data
>> > just fine--perhaps the payload and the request struct can't be in the
>> > same page?
>> >
>>
>> Non-cache coherent DMA involves cache invalidation on inbound data. So
>> if both the device and the CPU write to the same cacheline while the
>> buffer is mapped for DMA from device to memory, one of the updates
>> gets lost.
>
>
> Can you give me any pointers/examples of how I can make this work?
>

You could have a look at commit ed527b13d800dd515a9e6c582f0a73eca65b2e1b
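
To illustrate the cacheline point in general terms, here is a minimal
userspace sketch. It is not taken from the commit above or from the
qce driver. The struct names, the helper same_cacheline() and the
64-byte CACHELINE value are all assumptions made for illustration; the
real line size depends on the SoC. It only shows how a buffer the
device writes can end up sharing a cacheline with a field the CPU
writes, and how aligning both members keeps them apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cacheline size; hypothetical, check the actual hardware. */
#define CACHELINE 64

/* Layout with the hazard: 'hash' (written by the device via DMA) and
 * 'state' (written by the CPU) sit next to each other in one line. */
struct shared_bad {
	uint8_t  hash[16];
	uint32_t state;
};

/* Layout avoiding it: each member starts on its own cacheline, so
 * invalidating the line that holds 'hash' cannot discard a CPU write
 * to 'state'. */
struct shared_good {
	_Alignas(CACHELINE) uint8_t  hash[16];
	_Alignas(CACHELINE) uint32_t state;
};

/* Returns 1 if two byte offsets fall in the same cacheline. */
int same_cacheline(size_t off_a, size_t off_b)
{
	return off_a / CACHELINE == off_b / CACHELINE;
}

/* Compile-time check, in the spirit of a build-time assertion. */
static_assert(offsetof(struct shared_good, hash) / CACHELINE !=
	      offsetof(struct shared_good, state) / CACHELINE,
	      "DMA buffer must not share a cacheline with CPU data");
```

In kernel code the same separation is usually obtained either by
allocating the DMA buffer on its own with kmalloc()/kzalloc(), as in
the experiment described earlier in the thread, or with an alignment
annotation such as ____cacheline_aligned on the member.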