Andy,

On Sat, May 29 2021 at 22:12, Andy Lutomirski wrote:
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: b860eb8dce59 ("x86/fpu/xstate: Define new functions for clearing fpregs and xstates")
> Reported-by: syzbot+2067e764dbcd10721e2e@xxxxxxxxxxxxxxxxxxxxxxxxx

Debugged-by ...

> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>

...

>  /*
> - * Clear the FPU state back to init state.
> - *
> - * Called by sys_execve(), by the signal handler code and by various
> - * error paths.
> + * Reset current's user FPU states to the init states.  The caller promises
> + * that current's supervisor states (in memory or CPU regs as appropriate)
> + * as well as the XSAVE header in memory are intact.
>   */
> -static void fpu__clear(struct fpu *fpu, bool user_only)
> +void fpu__clear_user_states(struct fpu *fpu)
>  {
>          WARN_ON_FPU(fpu != &current->thread.fpu);
>
>          if (!static_cpu_has(X86_FEATURE_FPU)) {
> -                fpu__drop(fpu);
> -                fpu__initialize(fpu);
> +                fpu__clear_all(fpu);
>                  return;
>          }
>
>          fpregs_lock();
>
> -        if (user_only) {
> -                if (!fpregs_state_valid(fpu, smp_processor_id()) &&
> -                    xfeatures_mask_supervisor())
> -                        copy_kernel_to_xregs(&fpu->state.xsave,
> -                                             xfeatures_mask_supervisor());
> -                copy_init_fpstate_to_fpregs(xfeatures_mask_user());
> -        } else {
> -                copy_init_fpstate_to_fpregs(xfeatures_mask_all);
> -        }
> +        /*
> +         * Ensure that current's supervisor states are loaded into
> +         * their corresponding registers.
> +         */
> +        if (!fpregs_state_valid(fpu, smp_processor_id()) &&
> +            xfeatures_mask_supervisor())
> +                copy_kernel_to_xregs(&fpu->state.xsave,
> +                                     xfeatures_mask_supervisor());
>
> +        /*
> +         * Reset user states in registers.
> +         */
> +        copy_init_fpstate_to_fpregs(xfeatures_mask_user());
> +
> +        /*
> +         * Now all FPU registers have their desired values.  Inform the
> +         * FPU state machine that current's FPU registers are in the
> +         * hardware registers.
> +         */
>          fpregs_mark_activate();
> +
>          fpregs_unlock();

This is as wrong as before. The corrupted task->fpu.state still
survives.
For f*cks sake, I gave you a reproducer and a working patch and I
explained at great length what's broken and what needs to be fixed.

And of course you kept the bug which was in the offending commit, i.e.
not wiping the task->fpu.state corruption which causes the next XRSTOR
to fail:

[   34.095020] Bad FPU state detected at copy_kernel_to_fpregs+0x28/0x40, reinitializing FPU registers.
[   34.095052] WARNING: CPU: 0 PID: 1364 at arch/x86/mm/extable.c:65 ex_handler_fprestore+0x5f/0x70
...
[   34.153472]  switch_fpu_return+0x40/0xb0
[   34.154196]  exit_to_user_mode_prepare+0x8f/0x180
[   34.155060]  syscall_exit_to_user_mode+0x23/0x50
[   34.155912]  do_syscall_64+0x4d/0xb0

IOW, this is exactly the same shit as we had before.

So what is decent about this? Define decent...

Why the heck do you think I wasted a couple of days to:

  - Analyze the root cause

  - Distill a trivial C reproducer

  - Come up with a fully working and completely correct fix

Just because, right?

I'm fine with splitting up clear_all() and clear_user(), but what you
provided is as much of a clusterfuck as the commit it pretends to fix.

Yours seriously grumpy

      Thomas