On 2020-04-22 09:23:34 [+0200], Ard Biesheuvel wrote:
> My memory is a bit fuzzy here. I remember talking to the linux-rt guys
> about what delay is actually acceptable, which was a lot higher than I
> had thought based on their initial reports about scheduling blackouts
> on arm64 due to preemption remaining disabled for too long. I intended
> to revisit this with more accurate bounds but then I apparently
> forgot.
>
> So SIMD chacha20 and SIMD poly1305 both run in <5 cycles per bytes,
> both on x86 and ARM. If we take 20 microseconds as a ballpark upper
> bound for how long preemption may be disabled, that gives us ~4000
> bytes of ChaCha20 or Poly1305 on a hypothetical 1 GHz core.
>
> So I think 4 KB is indeed a reasonable quantum of work here. Only
> PAGE_SIZE is not necessarily equal to 4 KB on arm64, so we should use
> SZ_4K instead.
>
> *However*, at the time, the report was triggered by the fact that we
> were keeping SIMD enabled across calls into the scatterwalk API, which
> may call kmalloc()/kfree() etc. There is no need for that anymore, now
> that the FPU begin/end routines all have been optimized to restore the
> userland SIMD state lazily.

The 20usec sounds reasonable. The other concern was memory allocation
within the preempt-disable section. If this is no longer the case,
perfect.

Sebastian
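
For reference, the arithmetic behind the ~4000 byte figure quoted above, and
the idea of capping each preempt-disabled SIMD section at SZ_4K, can be
sketched in plain userspace C. This is only an illustration under stated
assumptions: simd_begin(), simd_end(), process_simd() and process_chunked()
are hypothetical stand-ins, not the kernel's FPU/NEON helpers or any real
crypto routine, and SZ_4K is redefined locally to mirror the kernel constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the kernel's SZ_4K; always 4096, unlike PAGE_SIZE on arm64. */
#define SZ_4K 4096u

/*
 * Bytes that fit into a time budget: budget_us microseconds at freq_mhz MHz
 * gives budget_us * freq_mhz cycles, divided by the cost per byte.
 */
static uint64_t bytes_within_budget(uint64_t budget_us, uint64_t freq_mhz,
                                    uint64_t cycles_per_byte)
{
    assert(cycles_per_byte != 0);
    return (budget_us * freq_mhz) / cycles_per_byte;
}

/* Hypothetical stand-ins for entering/leaving a preempt-disabled SIMD section. */
static unsigned int simd_sections;
static void simd_begin(void) { simd_sections++; }
static void simd_end(void) { }

/* Hypothetical placeholder for the real SIMD work (not a cipher). */
static void process_simd(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i] ^ 0x5a;
}

/* Process the input in quanta of at most SZ_4K bytes per SIMD section. */
static void process_chunked(uint8_t *dst, const uint8_t *src, size_t len)
{
    while (len) {
        size_t todo = len < SZ_4K ? len : SZ_4K;

        simd_begin();
        process_simd(dst, src, todo);
        simd_end();   /* preemption would be possible again here */

        dst += todo;
        src += todo;
        len -= todo;
    }
}
```

With the numbers from the quoted mail, bytes_within_budget(20, 1000, 5)
evaluates to 4000, which is where the 4 KB quantum comes from. A 10000 byte
input passed to process_chunked() is split into three sections of 4096, 4096
and 1808 bytes.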