> On Sep 27, 2018, at 8:19 AM, Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>
> Hey again Thomas,
>
>> On Thu, Sep 27, 2018 at 3:26 PM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
>>
>> Hi Thomas,
>>
>> I'm trying to optimize this for crypto performance while still taking
>> into account preemption concerns. I'm having a bit of trouble figuring
>> out a way to determine numerically what the upper bounds for this
>> stuff looks like. I'm sure I could pick a pretty sane number that's
>> arguably okay -- and way under the limit -- but I still am interested
>> in determining what that limit actually is. I was hoping there'd be a
>> debugging option called, "warn if preemption is disabled for too
>> long", or something, but I couldn't find anything like that. I'm also
>> not quite sure what the latency limits are, to just compute this with
>> a formula. Essentially what I'm trying to determine is:
>>
>>   preempt_disable();
>>   asm volatile(".fill N, 1, 0x90;");
>>   preempt_enable();
>>
>> What is the maximum value of N for which the above is okay? What
>> technique would you generally use in measuring this?
>>
>> Thanks,
>> Jason
>
> From talking to Peter (now CC'd) on IRC, it sounds like what you're
> mostly interested in is clocktime latency on reasonable hardware, with
> a goal of around ~20µs as a maximum upper bound? I don't expect to get
> anywhere near this value at all, but if you can confirm that's a
> decent ballpark, it would make for some interesting calculations.
>

I would add another consideration: if you can get better latency with
negligible overhead (0.1%? 0.05%), then that might make sense too. For
example, it seems plausible that checking need_resched() every few
blocks adds basically no overhead, and the SIMD helpers could do this
themselves or perhaps only ever do a block at a time. need_resched()
costs a cacheline access, but it’s usually a hot cacheline, and the
actual check is just whether a certain bit in memory is set.
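
To make the "check need_resched() every few blocks" idea concrete, here is a rough userspace sketch, not real kernel code. Everything in it is a hypothetical stand-in: need_resched_stub() models the flag check, simd_begin_stub()/simd_end_stub() model entering and leaving the preemption-disabled SIMD section (on x86 that would be something like kernel_fpu_begin()/kernel_fpu_end()), the XOR "cipher" is a placeholder, and BLOCKS_PER_CHECK = 4 is an arbitrary guess at "every few blocks".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel primitives. */
static bool resched_pending;        /* models the "need resched" flag bit */
static unsigned int simd_sections;  /* counts SIMD sections entered */

static bool need_resched_stub(void)
{
	return resched_pending;
}

/* Stand-in for entering the preemption-disabled SIMD section. */
static void simd_begin_stub(void)
{
	simd_sections++;
}

/*
 * Stand-in for leaving the SIMD section. In a real kernel, preemption
 * becomes possible here; we model "the scheduler ran" by clearing the flag.
 */
static void simd_end_stub(void)
{
	resched_pending = false;
}

#define BLOCK_SIZE       64
#define BLOCKS_PER_CHECK 4  /* assumed value for "every few blocks" */

/* Placeholder "cipher": XOR every byte of one block with key. */
static void process_block(uint8_t *block, uint8_t key)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++)
		block[i] ^= key;
}

/*
 * Process nblocks blocks inside a SIMD section, but poll the resched
 * flag every BLOCKS_PER_CHECK blocks. If it is set and work remains,
 * briefly leave and re-enter the section so preemption can happen.
 */
static void process_blocks(uint8_t *buf, size_t nblocks, uint8_t key)
{
	if (!nblocks)
		return;

	simd_begin_stub();
	for (size_t i = 0; i < nblocks; i++) {
		process_block(buf + i * BLOCK_SIZE, key);

		if ((i + 1) % BLOCKS_PER_CHECK == 0 && i + 1 < nblocks &&
		    need_resched_stub()) {
			simd_end_stub();
			simd_begin_stub();
		}
	}
	simd_end_stub();
}
```

The point of the sketch is that the common path pays only one flag read per BLOCKS_PER_CHECK blocks, and the expensive end/begin pair happens only when a reschedule is actually pending. Whether that fits inside the 0.05-0.1% overhead ballpark would have to be measured on real hardware.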