On 09/12/2016 08:03 PM, Paolo Bonzini wrote:
>
>
> On 12/09/2016 19:37, Christian Borntraeger wrote:
>> On 09/12/2016 06:44 PM, Paolo Bonzini wrote:
>>> I think that two CPUs doing reciprocal SIGPs could in principle end up
>>> waiting on each other to complete their run_on_cpu. If the SIGP has to
>>> be synchronous the fix is not trivial (you'd have to put the CPU in a
>>> state similar to cpu->halted = 1), otherwise it's enough to replace
>>> run_on_cpu with async_run_on_cpu.
>>
>> IIRC the sigps are supposed to be serialized by the big QEMU lock. Will
>> have a look.
>
> Yes, but run_on_cpu drops it when it waits on the qemu_work_cond
> condition variable. (Related: I stumbled upon it because I wanted to
> remove the BQL from run_on_cpu work items).

Yes, it seems you are right. If both CPUs have just exited from KVM while
doing a crossover sigp, each will do its arch_exit handling before it gets
to the queued run_on_cpu work, so each ends up waiting for the other and
we hang. Luckily this seems quite unlikely, but we still need to fix it.

We cannot simply switch to async_run_on_cpu, because the callbacks also
provide the condition code for the initiator, so this requires some rework.
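
To make the two points above concrete, here is a minimal single-threaded
sketch in C. It is not QEMU code and does not call run_on_cpu or
async_run_on_cpu; all names in it (VCpu, sync_call_would_deadlock,
async_queue_work, drain_work, sigp_handler) are hypothetical and exist only
for illustration. It models (a) why two reciprocal synchronous calls form a
wait cycle, and (b) one possible shape of the rework, assuming the condition
code may be delivered to the initiator later, when the target drains its
work queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPUS 2   /* size of the toy model, not a QEMU constant */
#define QLEN  4   /* per-CPU work queue capacity in this sketch */

typedef struct VCpu VCpu;
typedef void (*WorkFn)(VCpu *target, VCpu *initiator);

typedef struct {
    WorkFn fn;
    VCpu *initiator;
} WorkItem;

struct VCpu {
    VCpu *waiting_on;      /* non-NULL while blocked in a synchronous call */
    WorkItem queue[QLEN];  /* work queued for this CPU by other CPUs */
    size_t queued;
    int cc;                /* condition code for this CPU as initiator, -1 = pending */
};

/* Follow the wait-for chain starting at 'target'. If it leads back to
 * 'self', a synchronous call from 'self' to 'target' can never complete. */
static bool sync_call_would_deadlock(const VCpu *self, const VCpu *target)
{
    const VCpu *c = target;
    for (int steps = 0; c != NULL && steps < NCPUS; steps++) {
        if (c == self) {
            return true;
        }
        c = c->waiting_on;
    }
    return false;
}

/* Asynchronous variant: queue the work and return without waiting.
 * The initiator's condition code stays pending until the target runs it. */
static bool async_queue_work(VCpu *target, VCpu *initiator, WorkFn fn)
{
    if (target->queued == QLEN) {
        return false;
    }
    target->queue[target->queued++] = (WorkItem){ fn, initiator };
    initiator->cc = -1;
    return true;
}

/* Run everything queued for 'cpu'; each item reports back to its initiator. */
static void drain_work(VCpu *cpu)
{
    for (size_t i = 0; i < cpu->queued; i++) {
        assert(cpu->queue[i].fn != NULL);
        cpu->queue[i].fn(cpu, cpu->queue[i].initiator);
    }
    cpu->queued = 0;
}

/* Stand-in for a SIGP order handler: always reports cc 0 to the initiator. */
static void sigp_handler(VCpu *target, VCpu *initiator)
{
    (void)target;
    initiator->cc = 0;
}
```

In this model, once CPU A is marked as waiting on CPU B, a synchronous call
from B to A is reported as a deadlock. With the asynchronous variant both
CPUs can queue work for each other and then drain their own queues, but the
initiator has to cope with a condition code that is still pending when the
call returns, which is the rework mentioned above.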