On Tue, Jun 1, 2021, at 11:06 AM, Dave Hansen wrote: > On to patch 3: > > > fpu__clear_all() and fpu__clear_user_states() share an implementation, > > and the resulting code is almost unreadable. Clean it up. > > Could we flesh this changelog out a bit? I basically write these in the > process of understanding what the patch does, so this is less of "your > changelog needs help" and more of, "here's some more detail if you want it": > > fpu__clear() currently resets both register state and kernel XSAVE > buffer state. It has two modes: one for all state (supervisor and user) > and another for user state only. fpu__clear_all() uses the "all state" > (user_only=0) mode, while a number of signal paths use the user_only=1 mode. > > Make fpu__clear() work only for user state (user_only=1) and remove the > "all state" (user_only=0) code. Rename it to match so it can be used by > the signal paths. > > Replace the "all state" (user_only=0) fpu__clear() functionality. Use > the TIF_NEED_FPU_LOAD functionality instead of making any actual > hardware registers changes in this path. I’ll steal some of this. > > > arch/x86/kernel/fpu/core.c | 63 +++++++++++++++++++++++++------------- > > 1 file changed, 42 insertions(+), 21 deletions(-) > > > > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c > > index 571220ac8bea..a01cbb6a08bb 100644 > > --- a/arch/x86/kernel/fpu/core.c > > +++ b/arch/x86/kernel/fpu/core.c > > @@ -354,45 +354,66 @@ static inline void copy_init_fpstate_to_fpregs(u64 features_mask) > > } > > > > /* > > - * Clear the FPU state back to init state. > > - * > > - * Called by sys_execve(), by the signal handler code and by various > > - * error paths. > > + * Reset current's user FPU states to the init states. current's supervisor > > + * states, if any, are not modified by this function. The XSTATE header > > + * in memory is assumed to be intact when this is called. > > */ > > -static void fpu__clear(struct fpu *fpu, bool user_only) > > +void fpu__clear_user_states(struct fpu *fpu) > > { > > WARN_ON_FPU(fpu != ¤t->thread.fpu); > > > > if (!static_cpu_has(X86_FEATURE_FPU)) { > > - fpu__drop(fpu); > > - fpu__initialize(fpu); > > + fpu__clear_all(fpu); > > return; > > } > > > > fpregs_lock(); > > > > - if (user_only) { > > - if (!fpregs_state_valid(fpu, smp_processor_id()) && > > - xfeatures_mask_supervisor()) > > - copy_kernel_to_xregs(&fpu->state.xsave, > > - xfeatures_mask_supervisor()); > > - copy_init_fpstate_to_fpregs(xfeatures_mask_user()); > > - } else { > > - copy_init_fpstate_to_fpregs(xfeatures_mask_all); > > - } > > + /* > > + * Ensure that current's supervisor states are loaded into > > + * their corresponding registers. > > + */ > > + if (!fpregs_state_valid(fpu, smp_processor_id()) && > > + xfeatures_mask_supervisor()) > > + copy_kernel_to_xregs(&fpu->state.xsave, > > + xfeatures_mask_supervisor()); > > > > + /* > > + * Reset user states in registers. > > + */ > > + copy_init_fpstate_to_fpregs(xfeatures_mask_user()); > > + > > + /* > > + * Now all FPU registers have their desired values. Inform the > > + * FPU state machine that current's FPU registers are in the > > + * hardware registers. > > + */ > > fpregs_mark_activate(); > > + > > fpregs_unlock(); > > } > > > > -void fpu__clear_user_states(struct fpu *fpu) > > -{ > > - fpu__clear(fpu, true); > > -} > > +/* > > + * Reset current's FPU registers (user and supervisor) to their INIT values. > > + * This is used by execve(); out of an abundance of caution, it completely > > + * wipes and resets the XSTATE buffer in memory. > > + * > > + * Note that XSAVE (unlike XSAVES) expects the XSTATE buffer in memory to > > + * be valid, so there are certain forms of corruption of the XSTATE buffer > > + * in memory that would survive initializing the FPU registers and XSAVEing > > + * them to memory. > > + * > > + * This does not change the actual hardware registers; when fpu__clear_all() > > + * returns, TIF_NEED_FPU_LOAD will be set, and a subsequent exit to user mode > > + * will reload the hardware registers from memory. > > + */ > > void fpu__clear_all(struct fpu *fpu) > > { > > - fpu__clear(fpu, false); > > + fpregs_lock(); > > + fpu__drop(fpu); > > + fpu__initialize(fpu); > > + fpregs_unlock(); > > } > > Nit: Could we move the detailed comments about TIF_NEED_FPU_LOAD right > next to the fpu__initialize() call? It would make it painfully obvious > which call is responsible. The naming isn't super helpful here. > Hmm. I was actually thinking of open-coding it all. fpu__initialize() and fpu__drop() have very few callers, and I’m not even convinced that the other callers are calling them for a valid reason.