Re: [PATCH v2] crypto: aesni - add ccm(aes) algorithm implementation

Ard Biesheuvel <ardb@xxxxxxxxxx> · Tue, 15 Dec 2020 09:55:37 +0100

(+ Eric)

TL;DR: can we find a way to use synchronous SIMD skciphers/aeads
without cryptd or scalar fallbacks?

On Thu, 10 Dec 2020 at 13:19, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Thu, 10 Dec 2020 at 13:16, Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Dec 10, 2020 at 01:03:56PM +0100, Ard Biesheuvel wrote:
> > >
> > > But we should probably start policing this a bit more. For instance, we now have
> > >
> > > drivers/net/macsec.c:
> > >
> > > /* Pick a sync gcm(aes) cipher to ensure order is preserved. */
> > > tfm = crypto_alloc_aead("gcm(aes)", 0, CRYPTO_ALG_ASYNC);
> > >
> > > (btw the comment is bogus, right?)
> > >
> > > TLS_SW does the same thing in net/tls/tls_device_fallback.c.
> >
> > Short of us volunteering to write code for every user out there
> > I don't see a way out.
> >
> > > Async is obviously needed for h/w accelerators, but could we perhaps
> > > do better for s/w SIMD algorithms? Those are by far the most widely
> > > used ones.
> >
> > If you can come up with a way that avoids the cryptd model without
> > using a fallback obviously that would be the ultimate solution.
> >
>
> Could we disable softirq handling in these regions?

I have been looking into this a bit, and I wonder if we might consider
doing the following:
- forbid synchronous skcipher/aead encrypt/decrypt calls from any
  context other than task or softirq (insofar as this is not already
  the case)
- limit kernel mode SIMD in general to task or softirq context
- reduce the scope of simd begin/end blocks, which is better for
  PREEMPT in any case, and no longer results in a performance hit on
  x86 as it did before, now that we have lazy restore for the userland
  FPU state
- disable softirq processing when enabling kernel mode SIMD

This way, we don't need a scalar fallback at all: a softirq can no
longer interrupt a kernel mode SIMD region, so any SIMD use in softirq
context is guaranteed to occur when the SIMD registers are dead from
the task's point of view.
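
To make the invariant concrete, here is a rough userspace model of the
rules above. It is only a sketch: every name in it (model_cpu,
model_simd_begin() and so on) is made up for illustration and does not
correspond to any existing kernel API, and real context tracking is of
course per-CPU and more involved than a single enum.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: none of these names are kernel APIs. */
enum model_ctx { MODEL_CTX_TASK, MODEL_CTX_SOFTIRQ, MODEL_CTX_HARDIRQ };

struct model_cpu {
	enum model_ctx ctx;	/* context the CPU is executing in */
	int softirq_disable;	/* nesting depth of softirq-off sections */
	bool simd_in_use;	/* kernel code owns the SIMD registers */
};

/* Kernel mode SIMD is permitted in task or softirq context only. */
static bool model_may_use_simd(const struct model_cpu *cpu)
{
	return cpu->ctx == MODEL_CTX_TASK || cpu->ctx == MODEL_CTX_SOFTIRQ;
}

/* Entering a SIMD region also disables softirq processing. */
static bool model_simd_begin(struct model_cpu *cpu)
{
	if (!model_may_use_simd(cpu))
		return false;

	/* The invariant: nobody else can be inside a SIMD region here. */
	assert(!cpu->simd_in_use);

	cpu->softirq_disable++;
	cpu->simd_in_use = true;
	return true;
}

/* Leaving the region re-enables softirq processing. */
static void model_simd_end(struct model_cpu *cpu)
{
	assert(cpu->simd_in_use && cpu->softirq_disable > 0);

	cpu->simd_in_use = false;
	cpu->softirq_disable--;
}

/* A new softirq may only start while softirq processing is enabled. */
static bool model_softirq_can_run(const struct model_cpu *cpu)
{
	return cpu->softirq_disable == 0;
}
```

In this model, model_softirq_can_run() is false for as long as a SIMD
region is open, so a softirq that calls model_simd_begin() can never
find the registers already claimed, and callers in hardirq context are
simply refused.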

So the question is then how granular these kernel mode SIMD regions
need to be to avoid excessive latencies in softirq handling.
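
One way to think about the granularity is to bound the amount of data
handled per SIMD region and reopen the region for each chunk. The
sketch below is a userspace model of that idea only: the 4096 byte
limit is an arbitrary assumption rather than a measured value, and the
model_* names are hypothetical, not existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed upper bound on bytes processed per SIMD region. */
#define MODEL_SIMD_CHUNK 4096u

struct model_stats {
	size_t regions;		/* number of begin/end pairs issued */
	size_t longest_region;	/* most bytes handled with softirqs off */
};

/* Stands in for: begin SIMD region, process nbytes, end SIMD region. */
static void model_region_process(struct model_stats *st, size_t nbytes)
{
	st->regions++;
	if (nbytes > st->longest_region)
		st->longest_region = nbytes;
}

/* Split a request of len bytes into bounded SIMD regions. */
static void model_encrypt_chunked(struct model_stats *st, size_t len)
{
	assert(st != NULL);

	while (len > 0) {
		size_t n = len < MODEL_SIMD_CHUNK ? len : MODEL_SIMD_CHUNK;

		model_region_process(st, n);
		len -= n;
		/* Pending softirqs would get a chance to run here. */
	}
}
```

With this structure the worst-case softirq latency added by a request
scales with the chunk size rather than with the request length, at the
cost of more begin/end pairs per request.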

I think this could also be an opportunity for a bit more alignment
between architectures on this topic.

-- 
Ard.