On Thu, Oct 17, 2019 at 02:29:45PM +0200, Thomas Gleixner wrote:
> The more I look at this trainwreck, the less interested I am in merging
> any of this at all.
>
> The fact that it took Intel more than a year to figure out that the MSR
> is per core and not per thread is yet another proof that this industry
> just works by pure chance.
>
> There is a simple way out of this misery:
>
> Intel issues a microcode update which does:
>
>   1) Convert the OR logic of the AC enable bit in the TEST_CTRL MSR to
>      AND logic, i.e. when one thread disables AC it's automatically
>      disabled on the core.
>
>      Alternatively it suppresses the #AC when the current thread has it
>      disabled.
>
>   2) Provide a separate bit which indicates that the AC enable logic is
>      actually AND based or that #AC is suppressed when the current
>      thread has it disabled.
>
> Which way I don't really care as long as it makes sense.

The #AC enable bit doesn't use OR logic; it's straight-up shared, i.e.
writes on one CPU are immediately visible on its sibling CPU.  That
doesn't magically solve the problem, but I don't think we need IPIs to
coordinate between siblings.  Wouldn't something like the below work?
The per-cpu variables are pointers to state that is shared by sibling
threads, and callers are expected to have preemption disabled.

/* Siblings' entries point at the same lock and the same count. */
static DEFINE_PER_CPU(spinlock_t *, split_lock_ac_lock);
static DEFINE_PER_CPU(int *, split_lock_ac_disabled);

void split_lock_disable(void)
{
	spinlock_t *ac_lock = this_cpu_read(split_lock_ac_lock);
	int *disabled = this_cpu_read(split_lock_ac_disabled);
	u64 val;

	/* Serialize the RMW of the MSR shared with the sibling. */
	spin_lock(ac_lock);
	if (++(*disabled) == 1) {
		rdmsrl(MSR_TEST_CTRL, val);
		wrmsrl(MSR_TEST_CTRL,
		       val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT);
	}
	spin_unlock(ac_lock);
}

void split_lock_enable(void)
{
	spinlock_t *ac_lock = this_cpu_read(split_lock_ac_lock);
	int *disabled = this_cpu_read(split_lock_ac_disabled);
	u64 val;

	spin_lock(ac_lock);
	/* Re-enable #AC only when the last disabler goes away. */
	if (--(*disabled) == 0) {
		rdmsrl(MSR_TEST_CTRL, val);
		wrmsrl(MSR_TEST_CTRL,
		       val | MSR_TEST_CTRL_SPLIT_LOCK_DETECT);
	}
	spin_unlock(ac_lock);
}

To avoid the spin_lock and WRMSR latency on every VM-Enter and VM-Exit,
actions (3a) and (4a) from your matrix (copied below) could be changed
to do split_lock_disable() only if the guest actually generates an #AC,
and then do split_lock_enable() on the next VM-Exit.  Assuming even
legacy guests are somewhat sane and rarely trigger split locks, lazily
disabling the control would eliminate most of the overhead and would
also shrink the window in which the sibling CPU is running in the host
without #AC protection.  A sketch of that lazy variant follows at the
end of this mail.

 N | #AC       | #AC enabled | SMT | Ctrl    | Guest | Action
 R | available | on host     |     | exposed | #AC   |
---|-----------|-------------|-----|---------|-------|---------------------
   |           |             |     |         |       |
 0 | N         | x           | x   | N       | x     | None
   |           |             |     |         |       |
 1 | Y         | N           | x   | N       | x     | None
   |           |             |     |         |       |
 2 | Y         | Y           | x   | Y       | Y     | Forward to guest
   |           |             |     |         |       |
 3 | Y         | Y           | N   | Y       | N     | A) Store in vCPU and
   |           |             |     |         |       |    toggle on VMENTER/EXIT
   |           |             |     |         |       |
   |           |             |     |         |       | B) SIGBUS or KVM exit code
   |           |             |     |         |       |
 4 | Y         | Y           | Y   | Y       | N     | A) Disable globally on
   |           |             |     |         |       |    host.  Store in vCPU/
   |           |             |     |         |       |    guest state and
   |           |             |     |         |       |    eventually reenable
   |           |             |     |         |       |    when guest goes away.
   |           |             |     |         |       |
   |           |             |     |         |       | B) SIGBUS or KVM exit code

> If that's not going to happen, then we just bury the whole thing and
> put it on hold until a sane implementation of that functionality
> surfaces in silicon some day in the not so foreseeable future.
>
> Seriously, this makes only sense when it's by default enabled and not
> rendered useless by VIRT.  Otherwise we never get any reports and none
> of the issues are going to be fixed.
>
> Thanks,
>
>         tglx
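
As promised above, here's a rough sketch of the lazy disabling for
(3a)/(4a).  The hook names and the vcpu->arch.split_lock_ac_disabled
flag are made up for illustration, they are not existing KVM code:

/*
 * Called when a VM-Exit was caused by a guest #AC that hit while
 * split-lock detection was enabled and the control isn't exposed to
 * the guest, i.e. cases (3a)/(4a).  Turn detection off so the guest
 * can make forward progress, and remember that we did so.
 */
static void vmx_guest_split_lock_ac(struct kvm_vcpu *vcpu)
{
	split_lock_disable();
	vcpu->arch.split_lock_ac_disabled = true;
}

/*
 * Called on every VM-Exit, before the host does anything that might
 * split a lock.  Restores host #AC protection if a previous entry
 * took the lazy disable path.
 */
static void vmx_restore_split_lock_ac(struct kvm_vcpu *vcpu)
{
	if (vcpu->arch.split_lock_ac_disabled) {
		split_lock_enable();
		vcpu->arch.split_lock_ac_disabled = false;
	}
}

On the #AC exit itself, vmx_restore_split_lock_ac() sees the flag clear
and does nothing, so the spin_lock+WRMSR cost is paid only when the
guest actually generates split locks.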