Re: [PATCH 0/5] crypto: Implement cmac based on cbc skcipher

Ard Biesheuvel <ardb@xxxxxxxxxx> · Thu, 20 Aug 2020 09:48:02 +0200

On Thu, 20 Aug 2020 at 09:44, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Aug 20, 2020 at 09:33:21AM +0200, Ard Biesheuvel wrote:
> >
> > > On my machine the performance difference on a 1472-byte request
> > > between SIMD and generic is 2161 vs. 7558 (cycles).
> >
> > Sure. But your machine does not have the pathological FPU
> > preserve/restore performance.
>
> Why does that matter? These are numbers for cbc-aesni which means
> just a single preserve/restore for the whole request.
>

No, that is the whole problem. The CCM template has a CBCMAC
implementation that wraps the bare cipher, which means it invokes
crypto_cipher_encrypt_one() for each 16 bytes of input, and each of
those calls involves an FPU preserve/restore.
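
To put a number on that, here is a rough userspace model (not kernel
code) that only counts preserve/restore pairs for the two call patterns.
The model_* helpers and the counter are made up for illustration; they
stand in for whatever the SIMD driver does around each AES invocation,
and the per-pair cost itself is not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 16	/* AES block size in bytes */

static size_t fpu_pairs;	/* completed preserve/restore pairs */
static int fpu_in_use;		/* guards against unbalanced begin/end */

/* Hypothetical stand-in for claiming the FPU (preserve). */
static void model_fpu_begin(void)
{
	assert(!fpu_in_use);
	fpu_in_use = 1;
}

/* Hypothetical stand-in for releasing the FPU (restore). */
static void model_fpu_end(void)
{
	assert(fpu_in_use);
	fpu_in_use = 0;
	fpu_pairs++;
}

/* Bare-cipher pattern: every single-block call claims the FPU itself. */
static void model_encrypt_one(void)
{
	model_fpu_begin();
	/* one AES block would be processed here */
	model_fpu_end();
}

/* cbcmac wrapping the bare cipher: one pair per 16 bytes of input. */
static size_t mac_per_block(size_t len)
{
	fpu_pairs = 0;
	for (size_t off = 0; off < len; off += BLOCK_SIZE)
		model_encrypt_one();
	return fpu_pairs;
}

/* skcipher-style pattern: one pair around the whole request. */
static size_t mac_per_request(size_t len)
{
	fpu_pairs = 0;
	if (len) {
		model_fpu_begin();
		for (size_t off = 0; off < len; off += BLOCK_SIZE)
			; /* one AES block, FPU already claimed */
		model_fpu_end();
	}
	return fpu_pairs;
}
```

For the 1472-byte request quoted above, mac_per_block() counts 92 pairs
and mac_per_request() counts 1.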

> Or are you saying on Ben's machine cbc-aesni would have worse
> performance vs. aes-generic?
>

Yes, given the pathological overhead of FPU preserve/restore for every
block of 16 bytes processed by the cbcmac wrapper.

> > The mac80211 CCMP code uses a synchronous ccm aead, which gets backed
> > by a skcipher+ahash combo by the ccm template. So a synchronous ahash
> > is fine for this particular case.
>
> OK I was just grepping for cmac so didn't see this.
>
> For this case, I think it's even more important that it be converted
> over to async because its sending path is also in user context just
> like IPsec.
>

Indeed.

cmac() is not really relevant for performance, afaict. Only cbcmac()
is used for bulk data.

> So simply by sending wireless packets you can hog the CPU while
> doing SIMD in kernel context which would then kill the receive
> path if you're using the generic fallback.
>
> Cheers,
> --
> Email: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt