On 5/8/24 17:29, Chang S. Bae wrote:
> +void kernel_fpu_reset(void)
> +{
> +	kernel_fpu_begin();
> +	if (cpu_feature_enabled(X86_FEATURE_AMX_TILE))
> +		tile_release();
> +	kernel_fpu_end();
> +}
> +EXPORT_SYMBOL(kernel_fpu_reset);
> +
...
> --- a/drivers/platform/x86/intel/ifs/runtest.c
> +++ b/drivers/platform/x86/intel/ifs/runtest.c
> @@ -188,6 +188,8 @@ static int doscan(void *data)
>  	/* Only the first logical CPU on a core reports result */
>  	first = cpumask_first(cpu_smt_mask(cpu));
>
> +	kernel_fpu_reset();
> +
>  	wait_for_sibling_cpu(&scan_cpus_in, NSEC_PER_SEC);

Remember, kernel_fpu_begin/end() mark a section of code that needs the
FPU.  Once code calls kernel_fpu_end(), it no longer owns the FPU and
all bets are off.  An interrupt could theoretically come in and do
whatever it wants.

I _assume_ that this is practically impossible since the stop_machine()
infrastructure keeps interrupts at bay.  But it's rather subtle.

I'd probably just do this:

+	kernel_fpu_begin();
+	// AMX *MUST* be in the init state for the wrmsr() to work.
+	// But, the more in the init state, the less state the test
+	// has to save and restore.  Just zap everything.
+	restore_fpregs_from_fpstate(&init_fpstate,
+				    fpu_user_cfg.max_features);
+
 	wrmsrl(MSR_ACTIVATE_SCAN, params->activate->data);
 	rdmsrl(MSR_SCAN_STATUS, status.data);
+	kernel_fpu_end();

That's dirt simple.  It doesn't require new infrastructure.  It doesn't
call an opaque new helper.  It doesn't require a feature check.  It
probably makes the IFS test run faster.

It will also magically work for any fancy new feature that comes along
which *ALSO* needs to be in its init state ... with zero changes to this
code.  For bonus points, this code is quite universal.  It will work,
as-is, in a bunch of kernel contexts if some future deranged kernel
developer copies and pastes it.  The code you suggested above can race
unless it's called under stop_machine() and isn't safe to copy
elsewhere.

Three lines of code:

 1. IFS declares its need to own the FPU for a moment, like any other
    kernel_fpu_begin() user.  It's not a special snowflake.  It is
    boring.
 2. IFS zaps the FPU state
 3. IFS gives up the FPU

Am I out of my mind?  What am I missing?  Why bother with _anything_
more complicated than this?
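
To make the ownership argument concrete, here is a userspace toy model
of the two orderings.  It is only a sketch, not kernel code: every
toy_*() name and the struct are made up for illustration, and the
"other FPU user" rule (it can dirty state only while nobody owns the
FPU) is a simplifying assumption standing in for whatever might run
after kernel_fpu_end().

```c
/* Userspace toy model; none of these are real kernel APIs. */
#include <assert.h>
#include <stdbool.h>

struct toy_fpu {
	bool owned_by_kernel_section;	/* inside a begin/end section */
	bool amx_in_init_state;		/* tile state is in its init state */
};

static void toy_fpu_begin(struct toy_fpu *f) { f->owned_by_kernel_section = true; }
static void toy_fpu_end(struct toy_fpu *f)   { f->owned_by_kernel_section = false; }
static void toy_fpu_zap(struct toy_fpu *f)   { f->amx_in_init_state = true; }

/* Assumption: someone else may dirty the state only when the FPU is unowned. */
static void toy_other_fpu_user(struct toy_fpu *f)
{
	if (!f->owned_by_kernel_section)
		f->amx_in_init_state = false;
}

/* Stand-in for the MSR write: it works only if AMX is in its init state. */
static bool toy_activate_scan(const struct toy_fpu *f)
{
	return f->amx_in_init_state;
}

/* Ordering from the quoted patch: reset in its own section, scan later. */
static bool scan_reset_then_release(struct toy_fpu *f, bool intruder)
{
	toy_fpu_begin(f);
	toy_fpu_zap(f);
	toy_fpu_end(f);
	if (intruder)
		toy_other_fpu_user(f);	/* window: FPU is no longer owned */
	return toy_activate_scan(f);
}

/* Ordering suggested in the reply: own the FPU across zap and scan. */
static bool scan_inside_section(struct toy_fpu *f, bool intruder)
{
	bool ok;

	toy_fpu_begin(f);
	toy_fpu_zap(f);
	if (intruder)
		toy_other_fpu_user(f);	/* no effect: FPU is owned */
	ok = toy_activate_scan(f);
	toy_fpu_end(f);
	return ok;
}
```

In this model the first ordering is fine only as long as nothing
touches the FPU in the window after the section ends, which is the part
that stop_machine() has to guarantee.  The second ordering does not
depend on that.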