Re: [RFC PATCH 1/2] KVM: x86/xen: Fix use-after-free in kvm_xen_eventfd_update()

David Woodhouse <dwmw2@xxxxxxxxxxxxx> · Wed, 28 Dec 2022 09:32:23 +0000

On Wed, 2022-12-28 at 01:21 +0100, Michal Luczaj wrote:
> On 12/27/22 12:11, Paolo Bonzini wrote:
> > On 12/24/22 12:14, Michal Luczaj wrote:
> > > > This lock/unlock pair can cause a deadlock because it's inside the SRCU
> > > > read side critical section.  Fortunately it's simpler to just use
> > > > mutex_lock for the whole function instead of using two small critical
> > > > sections, and then SRCU is not needed.
> > > Ah, I didn't think using a single mutex_lock would be ok. And maybe that's
> > > a silly question, but how can this lock/unlock pair cause a deadlock? My
> > > assumption was it's a*sleepable*  RCU after all.
> > > 
> > 
> > This is a pretty simple AB-BA deadlock, just involving SRCU and a mutex 
> > instead of two mutexes:
> > 
> > Thread 1                        Thread 2
> >                                 mutex_lock()
> > srcu_read_lock()
> > mutex_lock()  // stops here
> >                                 synchronize_srcu()  // stops here
> >                                 mutex_unlock()
> > mutex_unlock()
> > srcu_read_unlock()
> > 
> > Thread 2's synchronize_srcu() is waiting for srcu_read_unlock(), which 
> > however won't happen until thread 1 can take the mutex.  But the mutex 
> > is taken by thread 2, hence the deadlock.
> 
> Ahh, I see. Thank you for explaining.
> 
> Does it mean kvm_xen_hcall_evtchn_send() deadlocks in the same fashion?
> 
>                                 kvm_xen_eventfd_reset()
>                                   mutex_lock()
> srcu_read_lock()
> kvm_xen_hcall_evtchn_send()
>   kvm_xen_set_evtchn()
>     mutex_lock()
>                                   synchronize_srcu()
> 
> Michal

Good catch.

You'll note that above that mutex_lock() in kvm_xen_set_evtchn()
there's already a comment saying "if in future it will be called
directly from a vCPU thread"... which it is now.

	/*
	 * For the irqfd workqueue, using the main kvm->lock mutex is
	 * fine since this function is invoked from kvm_set_irq() with
	 * no other lock held, no srcu. In future if it will be called
	 * directly from a vCPU thread (e.g. on hypercall for an IPI)
	 * then it may need to switch to using a leaf-node mutex for
	 * serializing the shared_info mapping.
	 */
	mutex_lock(&kvm->lock);

I think that if we switch to a new lock rather than piggy-backing on
kvm->lock, we can declare that none shall call synchronize_srcu() while
holding it, and thus avoid the potential deadlock?

We have to fix the one case where the Xen code already does that, in
kvm_xen_eventfd_reset(). But fixing that one alone wouldn't solve it
for the general case and let us use kvm->lock, I presume.
Attachment:
smime.p7s

Description: S/MIME cryptographic signature