On Wed, Oct 16, 2019 at 09:23:37AM -0700, Sean Christopherson wrote:
> On Wed, Oct 16, 2019 at 05:43:53PM +0200, Paolo Bonzini wrote:
> > On 16/10/19 17:41, Sean Christopherson wrote:
> > > On Wed, Oct 16, 2019 at 04:08:14PM +0200, Paolo Bonzini wrote:
> > >> SIGBUS (actually a new KVM_EXIT_INTERNAL_ERROR result from KVM_RUN is
> > >> better, but that's the idea) is for when you're debugging guests.
> > >> Global disable (or alternatively, disable SMT) is for production use.
> > >
> > > Alternatively, for guests without split-lock #AC enabled, what if KVM were
> > > to emulate the faulting instruction with split-lock detection temporarily
> > > disabled?
> >
> > Yes we can get fancy, but remember that KVM is not yet supporting
> > emulation of locked instructions.  Adding it is possible but shouldn't
> > be in the critical path for the whole feature.
>
> Ah, didn't realize that.  I'm surprised emulating all locks with cmpxchg
> doesn't cause problems (or am I misreading the code?).  Assuming I'm
> reading the code correctly, the #AC path could kick all other vCPUS on
> emulation failure and then retry emulation to "guarantee" success.  Though
> that's starting to build quite the house of cards.

Ugh, doesn't the existing emulation behavior create another KVM issue?  KVM
uses a locked cmpxchg in emulator_cmpxchg_emulated() and the address is
guest controlled, e.g. a guest could coerce the host into disabling
split-lock detection via the host's #AC handler by triggering emulation and
inducing an #AC in the emulator.

> > How would you disable split-lock detection temporarily?  Just tweak
> > MSR_TEST_CTRL for the time of running the one instruction, and cross
> > fingers that the sibling doesn't notice?
>
> Tweak MSR_TEST_CTRL, with logic to handle the scenario where split-lock
> detection is globally disabled during emulation (so KVM doesn't
> inadvertently re-enable it).
>
> There isn't much for the sibling to notice.  The kernel would temporarily
> allow split-locks on the sibling, but that's a performance issue and isn't
> directly fatal.  A missed #AC in the host kernel would only delay the
> inevitable global disabling of split-lock.  A missed #AC in userspace would
> again just delay the inevitable SIGBUS.
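
To make the "don't inadvertently re-enable it" logic a bit more concrete,
here is a rough sketch.  It is a single-threaded userspace model, not kernel
code and not an existing KVM implementation.  The only real detail is that
split-lock detection is bit 29 of MSR_TEST_CTRL; fake_msr_test_ctrl,
sld_globally_disabled, sld_global_disable(), emulate_with_sld_off() and the
two callbacks are all hypothetical names invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Split-lock detect enable is bit 29 of MSR_TEST_CTRL. */
#define TEST_CTRL_SPLIT_LOCK_DETECT (1ULL << 29)

/* Hypothetical stand-ins for the MSR and the kernel's global state. */
static uint64_t fake_msr_test_ctrl = TEST_CTRL_SPLIT_LOCK_DETECT;
static bool sld_globally_disabled;

/* Models the host #AC handler giving up on split-lock detection. */
static void sld_global_disable(void)
{
	sld_globally_disabled = true;
	fake_msr_test_ctrl &= ~TEST_CTRL_SPLIT_LOCK_DETECT;
}

/*
 * Run one emulation step with detection off.  The global state is
 * re-checked *after* the step, so a global disable that happened in the
 * meantime is not undone.  Real code would also need to be safe against
 * the SMT sibling and against interrupts; this model ignores both.
 */
static void emulate_with_sld_off(void (*emulate_one)(void))
{
	fake_msr_test_ctrl &= ~TEST_CTRL_SPLIT_LOCK_DETECT;

	emulate_one();

	if (!sld_globally_disabled)
		fake_msr_test_ctrl |= TEST_CTRL_SPLIT_LOCK_DETECT;
}

/* Sample step: nothing interesting happens during emulation. */
static void emulate_noop(void)
{
}

/* Sample step: detection gets globally disabled while emulating. */
static void emulate_racing_disable(void)
{
	sld_global_disable();
}
```

Calling emulate_with_sld_off(emulate_noop) leaves the detect bit set again
afterwards, whereas emulate_with_sld_off(emulate_racing_disable) leaves it
clear, which is the case the re-check exists for.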