On Wed, Apr 22, 2020 at 5:28 AM Sebastian Andrzej Siewior
<bigeasy@xxxxxxxxxxxxx> wrote:
>
> On 2020-04-22 09:23:34 [+0200], Ard Biesheuvel wrote:
> > My memory is a bit fuzzy here. I remember talking to the linux-rt guys
> > about what delay is actually acceptable, which was a lot higher than I
> > had thought based on their initial reports about scheduling blackouts
> > on arm64 due to preemption remaining disabled for too long. I intended
> > to revisit this with more accurate bounds but then I apparently
> > forgot.
> >
> > So SIMD chacha20 and SIMD poly1305 both run in <5 cycles per byte,
> > both on x86 and ARM. If we take 20 microseconds as a ballpark upper
> > bound for how long preemption may be disabled, that gives us ~4000
> > bytes of ChaCha20 or Poly1305 on a hypothetical 1 GHz core.
> >
> > So I think 4 KB is indeed a reasonable quantum of work here. Only
> > PAGE_SIZE is not necessarily equal to 4 KB on arm64, so we should use
> > SZ_4K instead.
> >
> > *However*, at the time, the report was triggered by the fact that we
> > were keeping SIMD enabled across calls into the scatterwalk API, which
> > may call kmalloc()/kfree() etc. There is no need for that anymore, now
> > that the FPU begin/end routines all have been optimized to restore the
> > userland SIMD state lazily.
>
> The 20usec sounds reasonable. The other concern was memory allocation
> within the preempt-disable section. If this is no longer the case,
> perfect.

Cool, thanks for the confirmation. I'll get a v2 of this patch out the door.
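
For reference, the arithmetic behind the 4 KB figure can be sketched as
below. This is a userspace illustration only, not the patch itself. The
helper names (max_bytes_per_simd_section, count_simd_sections) and the
CHUNK_4K constant are made up for the example, and the 20 usec budget,
1 GHz clock, and 5 cycles per byte are just the ballpark numbers quoted
above. Where the SIMD begin/end calls would sit is only indicated in a
comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's SZ_4K: always 4096,
 * independent of the architecture's PAGE_SIZE. */
#define CHUNK_4K ((size_t)4096)

/*
 * Rough upper bound on the bytes that fit in one preempt-disabled
 * section. MHz is the same as cycles per microsecond, so:
 *   cycles available = budget_usec * cpu_mhz
 *   bytes            = cycles available / cycles_per_byte
 */
uint64_t max_bytes_per_simd_section(uint64_t budget_usec,
                                    uint64_t cpu_mhz,
                                    uint64_t cycles_per_byte)
{
    assert(cycles_per_byte > 0);
    return budget_usec * cpu_mhz / cycles_per_byte;
}

/*
 * Walk a buffer of 'len' bytes in chunks of at most CHUNK_4K and
 * return how many separate SIMD sections that would take.
 */
size_t count_simd_sections(size_t len)
{
    size_t sections = 0;

    while (len > 0) {
        size_t todo = len < CHUNK_4K ? len : CHUNK_4K;

        /*
         * In kernel code, this is where one bounded section would go
         * (assumption, not taken from the patch):
         *   kernel_fpu_begin();   (kernel_neon_begin() on ARM)
         *   ... process 'todo' bytes with the SIMD routine ...
         *   kernel_fpu_end();     (kernel_neon_end() on ARM)
         */
        len -= todo;
        sections++;
    }
    return sections;
}
```

With the numbers from the thread, max_bytes_per_simd_section(20, 1000, 5)
comes out at 4000, which is the ~4000 bytes that the 4 KB quantum rounds
up from. A 10000 byte request would then be split into three sections.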