On Thu, Oct 17, 2019 at 02:29:45PM +0200, Thomas Gleixner wrote:
> The more I look at this trainwreck, the less interested I am in merging
> any of this at all.
>
> The fact that it took Intel more than a year to figure out that the MSR
> is per core and not per thread is yet another proof that this industry
> just works by pure chance.
>
> There is a simple way out of this misery:
>
> Intel issues a microcode update which does:
>
>   1) Convert the OR logic of the AC enable bit in the TEST_CTRL MSR to
>      AND logic, i.e. when one thread disables AC it's automatically
>      disabled on the core.
>
>      Alternatively it suppresses the #AC when the current thread has it
>      disabled.
>
>   2) Provide a separate bit which indicates that the AC enable logic is
>      actually AND based or that #AC is suppressed when the current
>      thread has it disabled.
>
> Which way I don't really care as long as it makes sense.

The #AC enable bit doesn't use OR logic; it's straight-up shared, i.e.
writes on one CPU are immediately visible on its sibling CPU.  That
doesn't magically solve the problem, but I don't think we need IPIs to
coordinate between siblings.  Wouldn't something like the below work?
The per-cpu variables are pointers to state that is shared by sibling
threads, and callers are expected to have preemption disabled.

/* Siblings' entries point at the same lock and the same count. */
static DEFINE_PER_CPU(spinlock_t *, split_lock_ac_lock);
static DEFINE_PER_CPU(int *, split_lock_ac_disabled);

void split_lock_disable(void)
{
	spinlock_t *ac_lock = this_cpu_read(split_lock_ac_lock);
	int *disabled = this_cpu_read(split_lock_ac_disabled);
	u64 val;

	/* Serialize the RMW of the MSR shared with the sibling. */
	spin_lock(ac_lock);
	if (++(*disabled) == 1) {
		rdmsrl(MSR_TEST_CTRL, val);
		wrmsrl(MSR_TEST_CTRL,
		       val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT);
	}
	spin_unlock(ac_lock);
}

void split_lock_enable(void)
{
	spinlock_t *ac_lock = this_cpu_read(split_lock_ac_lock);
	int *disabled = this_cpu_read(split_lock_ac_disabled);
	u64 val;

	spin_lock(ac_lock);
	/* Re-enable #AC only when the last disabler goes away. */
	if (--(*disabled) == 0) {
		rdmsrl(MSR_TEST_CTRL, val);
		wrmsrl(MSR_TEST_CTRL,
		       val | MSR_TEST_CTRL_SPLIT_LOCK_DETECT);
	}
	spin_unlock(ac_lock);
}

To avoid the spin_lock and WRMSR latency on every VM-Enter and VM-Exit,
actions (3a) and (4a) from your matrix (copied below) could be changed
to do split_lock_disable() only if the guest actually generates an #AC,
and then do split_lock_enable() on the next VM-Exit.  Assuming even
legacy guests are somewhat sane and rarely trigger split locks, lazily
disabling the control would eliminate most of the overhead and would
also shrink the window in which the sibling CPU is running in the host
without #AC protection.  A sketch of that lazy variant follows at the
end of this mail.

 N | #AC       | #AC enabled | SMT | Ctrl    | Guest | Action
 R | available | on host     |     | exposed | #AC   |
---|-----------|-------------|-----|---------|-------|---------------------
   |           |             |     |         |       |
 0 | N         | x           | x   | N       | x     | None
   |           |             |     |         |       |
 1 | Y         | N           | x   | N       | x     | None
   |           |             |     |         |       |
 2 | Y         | Y           | x   | Y       | Y     | Forward to guest
   |           |             |     |         |       |
 3 | Y         | Y           | N   | Y       | N     | A) Store in vCPU and
   |           |             |     |         |       |    toggle on VMENTER/EXIT
   |           |             |     |         |       |
   |           |             |     |         |       | B) SIGBUS or KVM exit code
   |           |             |     |         |       |
 4 | Y         | Y           | Y   | Y       | N     | A) Disable globally on
   |           |             |     |         |       |    host.  Store in vCPU/
   |           |             |     |         |       |    guest state and
   |           |             |     |         |       |    eventually reenable
   |           |             |     |         |       |    when guest goes away.
   |           |             |     |         |       |
   |           |             |     |         |       | B) SIGBUS or KVM exit code

> If that's not going to happen, then we just bury the whole thing and
> put it on hold until a sane implementation of that functionality
> surfaces in silicon some day in the not so foreseeable future.
>
> Seriously, this makes only sense when it's by default enabled and not
> rendered useless by VIRT.  Otherwise we never get any reports and none
> of the issues are going to be fixed.
>
> Thanks,
>
>         tglx
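
As promised above, here's a rough sketch of the lazy disabling for
(3a)/(4a).  The hook names and the vcpu->arch.split_lock_ac_disabled
flag are made up for illustration, they are not existing KVM code:

/*
 * Called when a VM-Exit was caused by a guest #AC that hit while
 * split-lock detection was enabled and the control isn't exposed to
 * the guest, i.e. cases (3a)/(4a).  Turn detection off so the guest
 * can make forward progress, and remember that we did so.
 */
static void vmx_guest_split_lock_ac(struct kvm_vcpu *vcpu)
{
	split_lock_disable();
	vcpu->arch.split_lock_ac_disabled = true;
}

/*
 * Called on every VM-Exit, before the host does anything that might
 * split a lock.  Restores host #AC protection if a previous entry
 * took the lazy disable path.
 */
static void vmx_restore_split_lock_ac(struct kvm_vcpu *vcpu)
{
	if (vcpu->arch.split_lock_ac_disabled) {
		split_lock_enable();
		vcpu->arch.split_lock_ac_disabled = false;
	}
}

On the #AC exit itself, vmx_restore_split_lock_ac() sees the flag clear
and does nothing, so the spin_lock+WRMSR cost is paid only when the
guest actually generates split locks.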