On 2/21/19 3:50 AM, Sebastian Andrzej Siewior wrote:
> diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
> index 67e4805bccb6f..05f6fce62e9f1 100644
> --- a/arch/x86/include/asm/fpu/internal.h
> +++ b/arch/x86/include/asm/fpu/internal.h
> @@ -562,8 +562,24 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
>   */
>  static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
>  {
> -	if (static_cpu_has(X86_FEATURE_FPU))
> -		__fpregs_load_activate(new_fpu, cpu);
> +	struct pkru_state *pk;
> +	u32 pkru_val = 0;
> +
> +	if (!static_cpu_has(X86_FEATURE_FPU))
> +		return;
> +
> +	__fpregs_load_activate(new_fpu, cpu);

This is still a bit light on comments.  Maybe:

	/* PKRU state is switched eagerly because... */

> +	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
> +		return;
> +
> +	if (current->mm) {
> +		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
> +		WARN_ON_ONCE(!pk);

This can trip on us if the 'init optimization' is in play because
get_xsave_addr() checks xsave->header.xfeatures.  That's unlikely today
because we usually set PKRU to a restrictive value.  But, it's also not
*guaranteed*.

Userspace could easily do an XRSTOR that puts PKRU back in its init
state if it wanted to, then this would end up with pk==NULL.

We might actually want a selftest that *does* that.  I don't think we
have one.

> +		if (pk)
> +			pkru_val = pk->pkru;
> +	}
> +	__write_pkru(pkru_val);
>  }

A comment above __write_pkru() would be nice to say that it only
actually does the slow instruction on changes to the value.

BTW, this has the implicit behavior of always trying to do a
__write_pkru(0) on switches to kernel threads.  That seems a bit weird
and it is likely to impose WRPKRU overhead on switches between user and
kernel threads.

The 0 value is also the most permissive, which is not great considering
that user mm's can be active in the page tables when running kernel
threads if we're being lazy.

Seems like we should either leave PKRU alone or have 'init_pkru_value' be the default. That gives good security properties and is likely to match the application value, removing the WRPKRU overhead.
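
To make the second option concrete, here is a rough user-space model of
the selection logic.  It is only a sketch, not kernel code: every
model_* name is made up for illustration, and the 0x55555554 constant
assumes init_pkru_value is "access-disable for keys 1-15", which should
be checked against arch/x86/mm/pkeys.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors init_pkru_value (access-disable for keys 1..15). */
#define MODEL_INIT_PKRU_VALUE 0x55555554u

/* Hypothetical stand-in for the PKRU component of the XSAVE buffer. */
struct model_pkru_state {
	uint32_t pkru;
};

/* Hypothetical stand-in for the CPU: live PKRU plus a WRPKRU counter. */
struct model_cpu {
	uint32_t pkru;
	unsigned int wrpkru_count;
};

/* Models __write_pkru(): the slow instruction only runs on a change. */
static void model_write_pkru(struct model_cpu *cpu, uint32_t val)
{
	if (cpu->pkru == val)
		return;
	cpu->pkru = val;
	cpu->wrpkru_count++;
}

/*
 * has_mm == false models a kernel thread (current->mm == NULL).
 * pk == NULL models get_xsave_addr() returning NULL because the PKRU
 * component is in its init state, which is a PKRU value of 0.
 */
static uint32_t model_pick_pkru(bool has_mm, const struct model_pkru_state *pk)
{
	if (!has_mm)
		return MODEL_INIT_PKRU_VALUE;	/* restrictive default */
	if (!pk)
		return 0;			/* init state, no WARN needed */
	return pk->pkru;
}

/* Models the PKRU part of switch_fpu_finish() with the proposed default. */
static void model_switch_fpu_finish(struct model_cpu *cpu, bool has_mm,
				    const struct model_pkru_state *pk)
{
	model_write_pkru(cpu, model_pick_pkru(has_mm, pk));
}
```

In this model, bouncing between a kernel thread and a task whose PKRU
still equals the init value never bumps wrpkru_count.  Only a task whose
PKRU component is in the XSAVE init state (pk == NULL) forces a write.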