Re: [PATCH v3] KVM: x86: Use fast path for Xen timer delivery

Sean Christopherson <seanjc@xxxxxxxxxx> · Tue, 6 Feb 2024 18:58:22 -0800

On Tue, Feb 06, 2024, David Woodhouse wrote:
> On Tue, 2024-02-06 at 10:41 -0800, Sean Christopherson wrote:
> > 
> > This has an obvious-in-hindsight recursive deadlock bug.  If KVM actually needs
> > to inject a timer IRQ, and the fast path fails, i.e. the gpc is invalid,
> > kvm_xen_set_evtchn() will attempt to acquire xen.xen_lock, which is already held
> 
> Hm, right. In fact, kvm_xen_set_evtchn() shouldn't actually *need* the
> xen_lock in an ideal world; it's only taking it in order to work around
> the fact that the gfn_to_pfn_cache doesn't have its *own* self-
> sufficient locking. I have patches for that...
> 
> I think the *simplest* of the "patches for that" approaches is just to
> use the gpc->refresh_lock to cover all activate, refresh and deactivate
> calls. I was waiting for Paul's series to land before sending that one,
> but I'll work on it today, and double-check my belief that we can then
> just drop xen_lock from kvm_xen_set_evtchn().

While I definitely want to get rid of arch.xen.xen_lock, I don't want to address
the deadlock by relying on adding more locking to the gpc code.  I want a teeny
tiny patch that is easy to review and backport.  Y'all are *probably* the only
folks that care about Xen emulation, but even so, that's not a valid reason for
taking a roundabout way to fix a deadlock.

Can't we simply not take xen_lock in kvm_xen_vcpu_get_attr()?  It holds vcpu->mutex
so it's mutually exclusive with kvm_xen_vcpu_set_attr(), and I don't see any flows
other than vCPU destruction that deactivate (or change) the gpc.

And the worst case scenario is that if _userspace_ is being stupid, userspace gets
a stale GPA.

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 4b4e738c6f1b..50aa28b9ffc4 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -973,8 +973,6 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 {
        int r = -ENOENT;
 
-       mutex_lock(&vcpu->kvm->arch.xen.xen_lock);
-
        switch (data->type) {
        case KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO:
                if (vcpu->arch.xen.vcpu_info_cache.active)
@@ -1083,7 +1081,6 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
                break;
        }
 
-       mutex_unlock(&vcpu->kvm->arch.xen.xen_lock);
        return r;
 }
 
 

If that seems too risky, we could go with an ugly and hacky, but conservative:

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 4b4e738c6f1b..456d05c5b18a 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1052,7 +1052,9 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
                 */
                if (vcpu->arch.xen.timer_expires) {
                        hrtimer_cancel(&vcpu->arch.xen.timer);
+                       mutex_unlock(&vcpu->kvm->arch.xen.xen_lock);
                        kvm_xen_inject_timer_irqs(vcpu);
+                       mutex_lock(&vcpu->kvm->arch.xen.xen_lock);
                }
 
                data->u.timer.port = vcpu->arch.xen.timer_virq;