Re: [PATCH crypto-stable] crypto: arch/lib - limit simd usage to PAGE_SIZE chunks

Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> · Wed, 22 Apr 2020 13:28:31 +0200

On 2020-04-22 09:23:34 [+0200], Ard Biesheuvel wrote:
> My memory is a bit fuzzy here. I remember talking to the linux-rt guys
> about what delay is actually acceptable, which was a lot higher than I
> had thought based on their initial reports about scheduling blackouts
> on arm64 due to preemption remaining disabled for too long. I intended
> to revisit this with more accurate bounds but then I apparently
> forgot.
> 
> So SIMD chacha20 and SIMD poly1305 both run in <5 cycles per bytes,
> both on x86 and ARM. If we take 20 microseconds as a ballpark upper
> bound for how long preemption may be disabled, that gives us ~4000
> bytes of ChaCha20 or Poly1305 on a hypothetical 1 GHz core.
> 
> So I think 4 KB is indeed a reasonable quantum of work here. Only
> PAGE_SIZE is not necessarily equal to 4 KB on arm64, so we should use
> SZ_4K instead.
> 
> *However*, at the time, the report was triggered by the fact that we
> were keeping SIMD enabled across calls into the scatterwalk API, which
> may call kmalloc()/kfree() etc. There is no need for that anymore, now
> that the FPU begin/end routines all have been optimized to restore the
> userland SIMD state lazily.

The 20usec sound reasonable. The other concern was memory allocation
within the preempt-disable section. If this is no longer the case,
perfect.

Sebastian