On Thu, Oct 17, 2019 at 11:31:15PM +0200, Thomas Gleixner wrote: > On Thu, 17 Oct 2019, Sean Christopherson wrote: > > On Thu, Oct 17, 2019 at 02:29:45PM +0200, Thomas Gleixner wrote: > > > The more I look at this trainwreck, the less interested I am in merging any > > > of this at all. > > > > > > The fact that it took Intel more than a year to figure out that the MSR is > > > per core and not per thread is yet another proof that this industry just > > > works by pure chance. > > > > > > There is a simple way out of this misery: > > > > > > Intel issues a microcode update which does: > > > > > > 1) Convert the OR logic of the AC enable bit in the TEST_CTRL MSR to > > > AND logic, i.e. when one thread disables AC it's automatically > > > disabled on the core. > > > > > > Alternatively it supresses the #AC when the current thread has it > > > disabled. > > > > > > 2) Provide a separate bit which indicates that the AC enable logic is > > > actually AND based or that #AC is supressed when the current thread > > > has it disabled. > > > > > > Which way I don't really care as long as it makes sense. > > > > The #AC bit doesn't use OR-logic, it's straight up shared, i.e. writes on > > one CPU are immediately visible on its sibling CPU. > > That's less horrible than I read out of your initial explanation. > > Thankfully all of this is meticulously documented in the SDM ... Preaching to the choir on this one... > Though it changes the picture radically. The truly shared MSR allows > regular software synchronization without IPIs and without an insane amount > of corner case handling. > > So as you pointed out we need a per core state, which is influenced by: > > 1) The global enablement switch > > 2) Host induced #AC > > 3) Guest induced #AC > > A) Guest has #AC handling > > B) Guest has no #AC handling > > #1: > > - OFF: #AC is globally disabled > > - ON: #AC is globally enabled > > - FORCE: same as ON but #AC is enforced on guests > > #2: > > If the host triggers an #AC then the #AC has to be force disabled on the > affected core independent of the state of #1. Nothing we can do about > that and once the initial wave of #AC issues is fixed this should not > happen on production systems. That disables #3 even for the #3.A case > for simplicity sake. > > #3: > > A) Guest has #AC handling > > #AC is forwarded to the guest. No further action required aside of > accounting > > B) Guest has no #AC handling > > If #AC triggers the resulting action depends on the state of #1: > > - FORCE: Guest is killed with SIGBUS or whatever the virt crowd > thinks is the appropriate solution > - ON: #AC triggered state is recorded per vCPU and the MSR is > toggled on VMENTER/VMEXIT in software from that point on. > > So the only interesting case is #3.B and #1.state == ON. There you need > serialization of the state and the MSR write between the cores, but only > when the vCPU triggered an #AC. Until then, nothing to do. And "vCPU triggered an #AC" should include an explicit check in KVM's emulator. > vmenter() > { > if (vcpu->ac_disable) > this_core_disable_ac(); > } > > vmexit() > { > if (vcpu->ac_disable) { > this_core_enable_ac(); > } > > this_core_dis/enable_ac() takes the global state into account and has the > necessary serialization in place. Overall, looks good to me. Although Tony's mail makes it obvious we need to sync internally...