On 8/27/21 11:21 AM, Borislav Petkov wrote: > On Fri, Aug 27, 2021 at 11:10:31AM -0700, Yu, Yu-cheng wrote: >> Because on context switches the whole xstates are switched together, >> we need to make sure all are in registers. > There's context switch code which does that already. > > Why would shstk_setup() be responsible for switching the whole extended > states buffer instead of only the shadow stack stuff only? I don't think this has anything to do with context-switching, really. The code lands in shstk_setup() which wants to make sure that the new MSR values are set before the task goes out to userspace. If TIF_NEED_FPU_LOAD was set, it could do that by going out to the XSAVE buffer and setting the MSR state in the buffer. Before returning to userspace, it would be XRSTOR'd. A WRMSR by itself would not be persistent because that XRSTOR would overwrite it. But, if TIF_NEED_FPU_LOAD is *clear* it means the XSAVE buffer is out-of-date and the registers are live. WRMSR can be used and there will be a XSAVE* to the task buffer during a context switch. So, this code takes the coward's way out: it *forces* TIF_NEED_FPU_LOAD to be clear by making the registers live with fpregs_restore_userregs(). That lets it just use WRMSR instead of dealing with the XSAVE buffer directly. If it didn't do this with the *WHOLE* set of user FPU state, we'd need more fine-granted "NEED_*_LOAD" tracking than our one FPU bit. This is also *only* safe because the task is newly-exec()'d and the FPU state was just reset. Otherwise, we might have had to worry that the non-PL3 SSPs have garbage or that non-SHSTK bits are set in MSR_IA32_U_CET. That said, after staring at it, I *think* this code is functionally correct and OK performance-wise. I suspect that the (very blunt) XRSTOR inside of start_update_msrs()->fpregs_restore_userregs() is quite rare because TIF_NEED_FPU_LOAD will usually be clear due to the proximity to execve(). So, adding direct XSAVE buffer manipulation would probably only make it more error prone.