On 01/10/2018 10:57, Jan Beulich wrote: >>>> On 01.10.18 at 09:16, <jgross@xxxxxxxx> wrote: >> xen_qlock_wait() isn't safe for nested calls due to interrupts. A call >> of xen_qlock_kick() might be ignored in case a deeper nesting level >> was active right before the call of xen_poll_irq(): >> >> CPU 1: CPU 2: >> spin_lock(lock1) >> spin_lock(lock1) >> -> xen_qlock_wait() >> -> xen_clear_irq_pending() >> Interrupt happens >> spin_unlock(lock1) >> -> xen_qlock_kick(CPU 2) >> spin_lock_irqsave(lock2) >> spin_lock_irqsave(lock2) >> -> xen_qlock_wait() >> -> xen_clear_irq_pending() >> clears kick for lock1 >> -> xen_poll_irq() >> spin_unlock_irq_restore(lock2) >> -> xen_qlock_kick(CPU 2) >> wakes up >> spin_unlock_irq_restore(lock2) >> IRET >> resumes in xen_qlock_wait() >> -> xen_poll_irq() >> never wakes up >> >> The solution is to disable interrupts in xen_qlock_wait() and not to >> poll for the irq in case xen_qlock_wait() is called in nmi context. > > Are precautions against NMI really worthwhile? Locks acquired both > in NMI context as well as outside of it are liable to deadlock anyway, > aren't they? The locks don't need to be the same. A NMI-only lock tried to be acquired with xen_qlock_wait() for another lock having been interrupted by the NMI will be enough to risk the issue. So yes, I believe the test for NMI is good to have. Juergen