Re: [patch 3/8] x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer

Borislav Petkov <bp@xxxxxxxxx> · Thu, 3 Jun 2021 21:28:56 +0200

On Thu, Jun 03, 2021 at 10:30:05AM -0700, Andy Lutomirski wrote:
> Think "complex microarchitectural conditions".

Ah, the magic phrase.

> How about:
> 
> As far as I can tell, both Intel and AMD consider it to be
> architecturally valid for XRSTOR to fail with #PF but nonetheless change
> user state.  The actual conditions under which this might occur are
> unclear [1], but it seems plausible that this might be triggered if one
> sibling thread unmaps a page and invalidates the shared TLB while
> another sibling thread is executing XRSTOR on the page in question.
> 
> __fpu__restore_sig() can execute XRSTOR while the hardware registers are
> preserved on behalf of a different victim task (using the
> fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but
> modify the registers.  If this happens, then there is a window in which
> __fpu__restore_sig() could schedule out and the victim task could
> schedule back in without reloading its own FPU registers.  This would
> result in part of the FPU state that __fpu__restore_sig() was attempting
> to load leaking into the victim task's user-visible state.
> 
> Invalidate preserved FPU registers on XRSTOR failure to prevent this
> situation from corrupting any state.
> 
> [1] Frequent readers of the errata lists might imagine "complex
> microarchitectural conditions".

Yap, very nice, thanks!
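
For anyone following along, the window described in that changelog can be
sketched with a toy userspace model in C. This is only an illustration
under stated assumptions, not kernel code: struct task, struct cpu,
sched_out(), sched_in(), faulting_restore() and victim_sees() are
hypothetical names, a single int stands in for the FPU registers, and the
"owner" pointer only loosely mimics what fpu_fpregs_owner_ctx tracks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code: every name here is hypothetical. */
struct task {
    int saved_fpstate;          /* in-memory copy of the task's FPU state */
};

struct cpu {
    int hw_regs;                /* stand-in for the hardware FPU registers */
    const struct task *owner;   /* whose state hw_regs is believed to hold */
};

/* Schedule out: save the registers to memory but leave them live in hardware. */
static void sched_out(struct cpu *c, struct task *t)
{
    t->saved_fpstate = c->hw_regs;
    c->owner = t;
}

/* Schedule in: skip the reload if the hardware is believed to hold t's state. */
static void sched_in(struct cpu *c, const struct task *t)
{
    assert(t != NULL);
    if (c->owner != t) {
        c->hw_regs = t->saved_fpstate;
        c->owner = t;
    }
}

/* Models a restore that faults after already clobbering the registers. */
static bool faulting_restore(struct cpu *c, int user_value,
                             bool invalidate_on_failure)
{
    c->hw_regs = user_value;    /* modification happens before the fault */
    if (invalidate_on_failure)
        c->owner = NULL;        /* registers no longer hold anyone's state */
    return false;               /* the restore itself failed */
}

/* Returns the register value the victim observes once it runs again. */
int victim_sees(bool invalidate_on_failure)
{
    struct cpu c = { .hw_regs = 111, .owner = NULL };
    struct task victim = { .saved_fpstate = 0 };

    sched_out(&c, &victim);                         /* victim's registers preserved */
    faulting_restore(&c, 999, invalidate_on_failure); /* other task's failed restore */
    sched_in(&c, &victim);                          /* victim is scheduled back in */
    return c.hw_regs;
}
```

In this model, victim_sees(false) hands the victim the clobbered value
because the reload is skipped, while victim_sees(true) forces a reload from
the saved copy. The real code paths are considerably more involved.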

> > I'm wondering if that comment can simply be above the TIF_NEED_FPU_LOAD
> > testing, standalone, instead of having it in an empty else? And then get
> > rid of that else.
> 
> I'm fine either way.

Ok, then let's aim for common, no-surprise-there patterns as we're in a
minefield here anyway.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette