* Eric Biggers <ebiggers@xxxxxxxxxx> wrote:

> [...] To avoid irqs_disabled() entirely, we'd need to avoid disabling
> softirqs, which would mean supporting nested kernel-mode FPU in
> softirqs. I can send out a patch that does that using a per-CPU
> buffer, if you'd like to see that. I wasn't super happy with the
> extra edge cases and memory usage, but we could go in that direction.

Meh: so I just checked, and local_bh_disable()/enable() are pretty
heavy these days - it's not just a simple preempt-count twiddle and a
check anymore. :-/

I don't think my initial argument of irqs_disabled() overhead is really
valid - and if we really cared, we could halve it by saving the
irqs_disabled() status at kernel_fpu_begin() time and reading it back
at kernel_fpu_end() time.

And the alternative of having nested FPU usage and extra per-CPU FPU
save areas for the kernel feels a bit fragile, even without having
seen the patch.

So I think I'll commit your patch to tip:x86/fpu as-is, unless someone
objects.

BTW., a side note: I was also reviewing the kernel_fpu_begin()/end()
code paths, and we have gems like:

	/* Put sane initial values into the control registers. */
	if (likely(kfpu_mask & KFPU_MXCSR) && boot_cpu_has(X86_FEATURE_XMM))
		ldmxcsr(MXCSR_DEFAULT);

	if (unlikely(kfpu_mask & KFPU_387) && boot_cpu_has(X86_FEATURE_FPU))
		asm volatile ("fninit");

Has the LDMXCSR instruction, or its effects, ever shown up in profiles?
Because AFAICS these will execute all the time on x86-64, because:

	static inline void kernel_fpu_begin(void)
	{
	#ifdef CONFIG_X86_64
		/*
		 * Any 64-bit code that uses 387 instructions must
		 * explicitly request KFPU_387.
		 */
		kernel_fpu_begin_mask(KFPU_MXCSR);

... and X86_FEATURE_XMM is set in pretty much every x86 CPU.

Thanks,

	Ingo