> On 09/12/2016 08:03 PM, Paolo Bonzini wrote:
> >
> >
> > On 12/09/2016 19:37, Christian Borntraeger wrote:
> >> On 09/12/2016 06:44 PM, Paolo Bonzini wrote:
> >>> I think that two CPUs doing reciprocal SIGPs could in principle end up
> >>> waiting on each other to complete their run_on_cpu. If the SIGP has to
> >>> be synchronous the fix is not trivial (you'd have to put the CPU in a
> >>> state similar to cpu->halted = 1), otherwise it's enough to replace
> >>> run_on_cpu with async_run_on_cpu.
> >>
> >> IIRC the sigps are supposed to be serialized by the big QEMU lock. Will
> >> have a look.
> >
> > Yes, but run_on_cpu drops it when it waits on the qemu_work_cond
> > condition variable. (Related: I stumbled upon it because I wanted to
> > remove the BQL from run_on_cpu work items).
>
> Yes, seems you are right. If both CPUs have just exited from KVM doing a
> crossover sigp, they will do the arch_exit handling before the run_on_cpu
> stuff which might result in this hang. (luckily it seems quite unlikely
> but still we need to fix it).
> We cannot simply use async as the callbacks also provide the condition
> code for the initiator, so this requires some rework.
>
>

Smells like having to provide a lock per CPU. Trylock that lock; if that's
not possible, cc=busy. SIGP SET ARCHITECTURE has to lock all CPUs.

That was the initial design, until I realized that this was all protected
by the BQL.

David
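
To make the per-CPU lock idea concrete, here is a rough sketch in plain C with pthreads. It is not QEMU code: NUM_CPUS, sigp_lock, sigp_one, sigp_all and the callback type are all made-up names for illustration, and I am assuming the lock that gets trylocked is the one of the destination CPU. The initiator never blocks, so two CPUs signalling each other cannot end up waiting on one another; the loser just sees cc=busy and retries.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_CPUS 4                      /* hypothetical fixed CPU count */

/* SIGP condition codes used in this sketch (0 = accepted, 2 = busy). */
enum { CC_ACCEPTED = 0, CC_BUSY = 2 };

/* One lock per CPU; all names here are illustrative, not QEMU's. */
static pthread_mutex_t sigp_lock[NUM_CPUS] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
};

/* Hypothetical handler that performs the order on the destination CPU. */
typedef void (*sigp_order_fn)(int dst);

/* Ordinary SIGP: try the destination's lock, never wait for it. */
int sigp_one(int dst, sigp_order_fn order)
{
    assert(dst >= 0 && dst < NUM_CPUS);

    if (pthread_mutex_trylock(&sigp_lock[dst]) != 0) {
        return CC_BUSY;                 /* someone else is signalling dst */
    }
    if (order) {
        order(dst);
    }
    pthread_mutex_unlock(&sigp_lock[dst]);
    return CC_ACCEPTED;
}

/* SET ARCHITECTURE style order: needs every CPU's lock at once. */
int sigp_all(sigp_order_fn order)
{
    int locked, i, cc = CC_BUSY;

    /* Take the locks in index order; stop at the first one that is held. */
    for (locked = 0; locked < NUM_CPUS; locked++) {
        if (pthread_mutex_trylock(&sigp_lock[locked]) != 0) {
            break;
        }
    }
    if (locked == NUM_CPUS) {
        for (i = 0; i < NUM_CPUS; i++) {
            if (order) {
                order(i);
            }
        }
        cc = CC_ACCEPTED;
    }
    /* Drop whatever we managed to take, in reverse order. */
    while (locked-- > 0) {
        pthread_mutex_unlock(&sigp_lock[locked]);
    }
    return cc;
}
```

Because the condition code is returned directly to the caller, the initiator still gets its cc synchronously, which is what rules out a plain switch to async_run_on_cpu. How the order itself would then be delivered to the target vCPU thread is left open here.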