On Thu, Jul 20, 2023 at 07:26:04AM +0200, Pankaj Gupta wrote:
> > > My understanding is that PL[0-2]_SSP are used only on transitions to the
> > > corresponding privilege level from a *different* privilege level.  That means
> > > KVM should be able to utilize the user_return_msr framework to load the host
> > > values.  Though if Linux ever supports SSS, I'm guessing the core kernel will
> > > have some sort of mechanism to defer loading MSR_IA32_PL0_SSP until an exit to
> > > userspace, e.g. to avoid having to write PL0_SSP, which will presumably be
> > > per-task, on every context switch.
> > >
> > > But note my original wording: **If that's necessary**
> > >
> > > If nothing in the host ever consumes those MSRs, i.e. if SSS is NOT enabled in
> > > IA32_S_CET, then running host stuff with guest values should be ok.  KVM only
> > > needs to guarantee that it doesn't leak values between guests.  But that should
> > > Just Work, e.g. KVM should load the new vCPU's values if SHSTK is exposed to the
> > > guest, and intercept (to inject #GP) if SHSTK is not exposed to the guest.
> > >
> > > And regardless of what the mechanism ends up managing SSP MSRs, it should only
> > > ever touch PL0_SSP, because Linux never runs anything at CPL1 or CPL2, i.e. will
> > > never consume PL{1,2}_SSP.
> >
> > To clarify, Linux will only use SSS in FRED mode -- FRED removes CPL1,2.
>
> Trying to understand more what prevents SSS to enable in pre FRED, Is
> it better #CP exception handling with other nested exceptions?

SSS took the syscall gap and made it worse -- as in *way* worse.

To top it off, the whole SSS busy bit thing is fundamentally
incompatible with how we manage to survive nested exceptions in NMI
context.

Basically, the whole x86 exception / stack switching logic was already
borderline impossible (consider taking an MCE in the early NMI path
where we set up, but have not finished, the re-entrancy stuff), and SSS
pushed it over the edge and set it on fire.

And NMI isn't the only problem, the various new virt exceptions #VC and
#HV are on their own already near impossible, adding SSS again pushes
the whole thing into clear insanity.

There's a good exposition of the whole trainwreck by Andrew here:

  https://www.youtube.com/watch?v=qcORS8CN0ow

(that is, sorry for the youtube link, but Google is failing me in
finding the actual Google Doc that talk is based on, or even the slide
deck :/)

FRED solves all that by:

 - removing the stack gap, cs/ip/ss/sp/ssp/gs will all be switched
   atomically and consistently for every transition.

 - removing the non-reentrant IST mechanism and replacing it with stack
   levels

 - adding an explicit NMI latch

 - re-organising the actual shadow stacks and doing away with that busy
   bit thing (I need to re-read the FRED spec on this detail again).

Crazy as we are, we're not touching legacy/IDT SSS with a ten foot pole,
sorry.
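
For anybody who wants to see the busy bit problem in miniature, below is
a toy userspace model. It is only a sketch: struct toy_sss and the
toy_*() helpers are made-up names, the token layout is simplified to
"stack address with bit 0 as the busy flag", and a refused entry stands
in for whatever fault the hardware would raise. The one thing it tries
to show is that a second entry onto the same supervisor shadow stack is
refused until the first one has left, which is exactly the case nested
exceptions need to survive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; bit 0 of the token plays the role of the busy bit. */
#define TOY_TOKEN_BUSY	0x1ULL

struct toy_sss {
	uint64_t base;	/* value the token holds while the stack is free */
	uint64_t token;	/* base, possibly with TOY_TOKEN_BUSY set */
};

static void toy_sss_init(struct toy_sss *s, uint64_t base)
{
	s->base = base;
	s->token = base;
}

/* Event delivery: switch onto the stack only if its token is free. */
static bool toy_sss_enter(struct toy_sss *s)
{
	if (s->token != s->base)	/* busy, or token mismatch */
		return false;		/* stand-in for the hardware fault */

	s->token |= TOY_TOKEN_BUSY;
	return true;
}

/* Event return: release the stack again. */
static void toy_sss_leave(struct toy_sss *s)
{
	assert(s->token & TOY_TOKEN_BUSY);
	s->token &= ~TOY_TOKEN_BUSY;
}

/* Enter once, then try to nest on the same stack before leaving. */
static bool toy_nested_entry_refused(void)
{
	struct toy_sss s;
	bool nested;

	toy_sss_init(&s, 0x1000);
	if (!toy_sss_enter(&s))
		return false;

	nested = toy_sss_enter(&s);	/* the "nested exception" */
	toy_sss_leave(&s);

	return !nested;
}
```

In this model toy_nested_entry_refused() returns true: the nested entry
finds the token busy and is turned away, and the stack only becomes
usable again after toy_sss_leave().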