mn Fri, Jun 30, 2023, Chao Gao wrote: > On Fri, Jun 30, 2023 at 03:26:12PM +0800, Qi Ai wrote: > >+ !is_protmode(vcpu)) > >+ kvm_x86_ops.clear_hlt(vcpu); > > Use static_call_cond(kvm_x86_clear_hlt)(vcpu) instead. > > It looks incorrect that we add this side-effect heuristically here. I Yeah, adding heuristics to KVM_SET_REGS isn't happening. KVM's existing hack for "Older userspace" in __set_sregs_common() is bad enough: /* Older userspace won't unhalt the vcpu on reset. */ if (kvm_vcpu_is_bsp(vcpu) && kvm_rip_read(vcpu) == 0xfff0 && sregs->cs.selector == 0xf000 && sregs->cs.base == 0xffff0000 && !is_protmode(vcpu)) vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE; > am wondering if we can link vcpu->arch.mp_state to VMCS activity state, Hrm, maybe. > i.e., when mp_state is set to RUNNABLE in KVM_SET_MP_STATE ioctl, KVM > sets VMCS activity state to active. Not in the ioctl(), there needs to be a proper set of APIs, e.g. so that the existing hack works, and so that KVM actually reports out to userspace that a vCPU is HALTED if userspace gained control of the vCPU, e.g. after an IRQ exit, while the vCPU was HALTED. I.e. mp_state versus vmcs.ACTIVITY_STATE needs to be bidirectional, not one-way. E.g. if a vCPU is live migrated, I'm pretty sure vmcs.ACTIVITY_STATE is lost, which is wrong. The downside is that if KVM propagates vmcs.ACTIVITY_STATE to mp_state for the halted case, then KVM will enter kvm_vcpu_halt() instead of entering the guest in halted state, which is undesirable. Hmm, that can be handled by treating the vCPU as running, e.g. static inline bool kvm_vcpu_running(struct kvm_vcpu *vcpu) { return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE || (vcpu->arch.mp_state == KVM_MP_STATE_HALTED && kvm_hlt_in_guest(vcpu->kvm))) && !vcpu->arch.apf.halted); } but that would have cascading effect to a whole pile of things. I don't *think* they'd be used with kvm_hlt_in_guest(), but we've had weirder stuff. I'm half tempted to solve this particular issue by stuffing vmcs.ACTIVITY_STATE on shutdown, similar to what SVM does on shutdown interception. KVM doesn't come anywhere near faithfully emulating shutdown, so it's unlikely to break anything. And then the mp_state vs. hlt_in_guest coulbe tackled separately. Ugh, but that wouldn't cover a synthesized KVM_REQ_TRIPLE_FAULT. diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 44fb619803b8..ee4bb37067d1 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5312,6 +5312,8 @@ static __always_inline int handle_external_interrupt(struct kvm_vcpu *vcpu) static int handle_triple_fault(struct kvm_vcpu *vcpu) { + vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE); + vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN; vcpu->mmio_needed = 0; return 0; I don't suppose QEMU can to blast INIT at all vCPUs for this case?