Hi Paolo, What's your opinion about this patch? We found it just before finishing patches for the past two days. Thanks, -Gonglei > -----Original Message----- > From: kvm-owner@xxxxxxxxxxxxxxx [mailto:kvm-owner@xxxxxxxxxxxxxxx] On > Behalf Of Herongguang (Stephen) > Sent: Thursday, April 06, 2017 9:47 AM > To: Paolo Bonzini; rkrcmar@xxxxxxxxxx; afaerber@xxxxxxx; > jan.kiszka@xxxxxxxxxxx; qemu-devel@xxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; > wangxin (U); Huangweidong (C) > Subject: Re: [BUG/RFC] INIT IPI lost when VM starts > > > > On 2017/4/6 0:16, Paolo Bonzini wrote: > > > > On 20/03/2017 15:21, Herongguang (Stephen) wrote: > >> We encountered a problem that when a domain starts, seabios failed to > >> online a vCPU. > >> > >> After investigation, we found that the reason is in kvm-kmod, > >> KVM_APIC_INIT bit in > >> vcpu->arch.apic->pending_events was overwritten by qemu, and thus an > >> INIT IPI sent > >> to AP was lost. Qemu does this since libvirtd sends a ‘query-cpus’ qmp > >> command to qemu > >> on VM start. > >> > >> In qemu, qmp_query_cpus-> cpu_synchronize_state-> > >> kvm_cpu_synchronize_state-> > >> do_kvm_cpu_synchronize_state, qemu gets registers/vcpu_events from > >> kvm-kmod and > >> sets cpu->kvm_vcpu_dirty to true, and vcpu thread in qemu will call > >> kvm_arch_put_registers if cpu->kvm_vcpu_dirty is true, thus > >> pending_events is > >> overwritten by qemu. > >> > >> I think there is no need for qemu to set cpu->kvm_vcpu_dirty to true > >> after ‘query-cpus’, > >> and kvm-kmod should not clear KVM_APIC_INIT unconditionally. And I am > >> not sure whether > >> it is OK for qemu to set cpu->kvm_vcpu_dirty in > >> do_kvm_cpu_synchronize_state in each caller. > >> > >> What’s your opinion? > > Hi Rongguang, > > > > sorry for the late response. > > > > Where exactly is KVM_APIC_INIT dropped? kvm_get_mp_state does clear > the > > bit, but the result of the INIT is stored in mp_state. > > It's dropped in KVM_SET_VCPU_EVENTS, see below. > > > > > kvm_get_vcpu_events is called after kvm_get_mp_state; it retrieves > > KVM_APIC_INIT in events.smi.latched_init and kvm_set_vcpu_events passes > > it back. Maybe it should ignore events.smi.latched_init if not in SMM, > > but I would like to understand the exact sequence of events. > > time0: > vcpu1: > qmp_query_cpus-> cpu_synchronize_state-> kvm_cpu_synchronize_state-> > > do_kvm_cpu_synchronize_state(and set vcpu1's cpu->kvm_vcpu_dirty to > true)-> kvm_arch_get_registers(KVM_APIC_INIT bit in > vcpu->arch.apic->pending_events was not set) > > time1: > vcpu0: > send INIT-SIPI to all AP->(in vcpu 0's > context)__apic_accept_irq(KVM_APIC_INIT bit in vcpu1's > arch.apic->pending_events is set) > > time2: > vcpu1: > kvm_cpu_exec->(if cpu->kvm_vcpu_dirty is > true)kvm_arch_put_registers->kvm_put_vcpu_events(overwritten > KVM_APIC_INIT bit in vcpu->arch.apic->pending_events!) > > So it's a race between vcpu1 get/put registers with kvm/other vcpus changing > vcpu1's status/structure fields in the mean time, I am in worry of if there are > other fields may be overwritten, > sipi_vector is one. > > also see: > https://www.mail-archive.com/qemu-devel@xxxxxxxxxx/msg438675.html > > > Thanks, > > > > paolo > > > > . > > >