On Thu, Jun 03, 2021 at 10:30:05AM -0700, Andy Lutomirski wrote:
> Think "complex microarchitectural conditions".

Ah, the magic phrase.

> How about:
>
> As far as I can tell, both Intel and AMD consider it to be
> architecturally valid for XRSTOR to fail with #PF but nonetheless change
> user state. The actual conditions under which this might occur are
> unclear [1], but it seems plausible that this might be triggered if one
> sibling thread unmaps a page and invalidates the shared TLB while
> another sibling thread is executing XRSTOR on the page in question.
>
> __fpu__restore_sig() can execute XRSTOR while the hardware registers are
> preserved on behalf of a different victim task (using the
> fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but
> modify the registers. If this happens, then there is a window in which
> __fpu__restore_sig() could schedule out and the victim task could
> schedule back in without reloading its own FPU registers. This would
> result in part of the FPU state that __fpu__restore_sig() was attempting
> to load leaking into the victim task's user-visible state.
>
> Invalidate preserved FPU registers on XRSTOR failure to prevent this
> situation from corrupting any state.
>
> [1] Frequent readers of the errata lists might imagine "complex
> microarchitectural conditions".

Yap, very nice, thanks!

> > I'm wondering if that comment can simply be above the TIF_NEED_FPU_LOAD
> > testing, standalone, instead of having it in an empty else? And then get
> > rid of that else.
>
> I'm fine either way.

Ok, then let's aim for common, no-surprise-there patterns as we're in a
mine field here anyway.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette