Jason,

On Wed, May 04 2022 at 17:55, Jason A. Donenfeld wrote:
> On Wed, May 04, 2022 at 05:36:38PM +0200, Thomas Gleixner wrote:
>> But the only use case which utilizes FPU from hard interrupt context is
>> the random generator via add_randomness_...().
>>
>> I did a benchmark of these functions, which invoke blake2s_update()
>> three times in a row, on a SKL-X and a ZEN3. The generic code and the
>> FPU accelerated code are pretty much on par vs. execution time of the
>> algorithm itself plus/minus noise.
>>
>> IOW, using the FPU blindly for this kind of computations is not
>> necessarily a good plan. I have no idea how these things are analyzed
>> and evaluated if at all. Maybe the crypto people can shed some light on
>> this.
>
> drivers/net/wireguard/{noise,cookie}.c makes pretty heavy use of BLAKE2s
> in hot paths where the FPU is already being used for other algorithms,
> and so there the save/restore is worth it (assuming restore finally
> works lazily). In benchmarks, the SIMD code made a real difference.

I'm sure there are very valid use cases, but just the two things I
looked at turned out to be questionable at least.

> But this presumably regards mix_pool_bytes() in the RNG. If it turns out
> that supporting the FPU in hard IRQ context is a major PITA, and the
> RNG

Supporting FPU in hard interrupt context is possible if required and the
preexisting bug which survived 10+ years has been fixed.

I just started to look into this because of that bug and due to the
inconsistency between the FPU protections we have. The inconsistency
comes from the hardirq requirement.

> is the only thing making use of it, then sure, drop hard IRQ context
> support for it. However... This may be unearthing a larger bug.
> Sebastian and I put in a decent amount of work during 5.18 to remove all
> calls to mix_pool_bytes() (and hence to blake2s_compress()) from
> add_interrupt_randomness(). Have a look:

I know.

> It now accumulates in some per-CPU buffer, and then every 64 interrupts
> a worker runs that does the actual mix_pool_bytes() from kthread
> context.

That's add_interrupt_randomness() and not affected by this.

> So the question is: what is still hitting mix_pool_bytes() from hard IRQ
> context? I'll investigate a bit and see.

add_disk_randomness() on !RT kernels. That's what made me look into this
in the first place as it unearthed the long standing FPU protection
bug. See the first patch in this thread.

Possibly add_device_randomness() too, but I haven't seen evidence so
far.

Thanks,

        tglx
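
For readers following along, the batching scheme quoted above (cheap
per-CPU accumulation in hard IRQ context, expensive mixing only every 64
interrupts) can be sketched roughly as below. This is a userspace
illustration only, not the code in drivers/char/random.c: every name
ending in _sketch, the CPU count, and the toy mixing arithmetic are
assumptions made up for the example, and the deferred worker is replaced
by a direct call.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH   4   /* hypothetical CPU count for the sketch */
#define BATCH_INTERRUPTS 64  /* "every 64 interrupts" from the mail */

/* Hypothetical per-CPU accumulator; not the kernel's real structure. */
struct fast_pool_sketch {
	uint32_t acc;        /* cheap running mix of the samples */
	unsigned int count;  /* samples seen since the last flush */
};

static struct fast_pool_sketch pools[NR_CPUS_SKETCH];
static uint32_t input_pool_sketch;        /* stand-in for the input pool */
static unsigned int expensive_mix_calls;  /* how often the slow path ran */

/*
 * Stand-in for the expensive mix_pool_bytes()/BLAKE2s path. In the
 * scheme described above this runs from a worker, not from hard IRQ
 * context, so using SIMD here would not need hardirq FPU support.
 */
static void expensive_mix_sketch(uint32_t v)
{
	input_pool_sketch = (input_pool_sketch ^ v) * 0x01000193u;
	expensive_mix_calls++;
}

/* Hard IRQ side: integer-only work, no FPU/SIMD involved. */
static void irq_sample_sketch(unsigned int cpu, uint32_t sample)
{
	struct fast_pool_sketch *p;

	assert(cpu < NR_CPUS_SKETCH);
	p = &pools[cpu];

	/* Toy mixing step: rotate left by 7 and fold in the sample. */
	p->acc = ((p->acc << 7) | (p->acc >> 25)) ^ sample;

	if (++p->count < BATCH_INTERRUPTS)
		return;

	/* A kernel would queue a worker here; the sketch calls directly. */
	expensive_mix_sketch(p->acc);
	p->acc = 0;
	p->count = 0;
}
```

Feeding 63 samples to one CPU leaves expensive_mix_calls at 0 and the
64th sample bumps it to 1. A path that calls the expensive mix directly
from hard IRQ context, as add_disk_randomness() is said to do on !RT
kernels, gets no such deferral.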