On 28/07/2017 04:31, Longpeng (Mike) wrote:
> Hi Paolo,
>
> On 2017/6/6 18:57, Paolo Bonzini wrote:
>
>> In some cases, for example involving hot-unplug of assigned
>> devices, pi_post_block can forget to remove the vCPU from the
>> blocked_vcpu_list. When this happens, the next call to
>> pi_pre_block corrupts the list.
>>
>> Fix this in two ways. First, check vcpu->pre_pcpu in pi_pre_block
>> and WARN instead of adding the element twice in the list. Second,
>> always do the list removal in pi_post_block if vcpu->pre_pcpu is
>> set (not -1).
>>
>> The new code keeps interrupts disabled for the whole duration of
>> pi_pre_block/pi_post_block. This is not strictly necessary, but
>> easier to follow. For the same reason, PI.ON is checked only
>> after the cmpxchg, and to handle it we just call the post-block
>> code. This removes duplication of the list removal code.
>>
>> Cc: Longpeng (Mike) <longpeng2@xxxxxxxxxx>
>> Cc: Huangweidong <weidong.huang@xxxxxxxxxx>
>> Cc: Gonglei <arei.gonglei@xxxxxxxxxx>
>> Cc: wangxin <wangxinxin.wang@xxxxxxxxxx>
>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>> ---
>>  arch/x86/kvm/vmx.c | 62 ++++++++++++++++++++++--------------------------------
>>  1 file changed, 25 insertions(+), 37 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 747d16525b45..0f4714fe4908 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -11236,10 +11236,11 @@ static void __pi_post_block(struct kvm_vcpu *vcpu)
>>  	struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
>>  	struct pi_desc old, new;
>>  	unsigned int dest;
>> -	unsigned long flags;
>>
>>  	do {
>>  		old.control = new.control = pi_desc->control;
>> +		WARN(old.nv != POSTED_INTR_WAKEUP_VECTOR,
>> +		     "Wakeup handler not enabled while the VCPU is blocked\n");
>>
>>  		dest = cpu_physical_id(vcpu->cpu);
>>
>> @@ -11256,14 +11257,10 @@ static void __pi_post_block(struct kvm_vcpu *vcpu)
>>  	} while (cmpxchg(&pi_desc->control, old.control,
>>  			new.control) != old.control);
>>
>> -	if(vcpu->pre_pcpu != -1) {
>> -		spin_lock_irqsave(
>> -			&per_cpu(blocked_vcpu_on_cpu_lock,
>> -			vcpu->pre_pcpu), flags);
>> +	if (!WARN_ON_ONCE(vcpu->pre_pcpu == -1)) {
>
>
> __pi_post_block is only called by pi_post_block/pi_pre_block now, it seems that
> both of them would make sure "vcpu->pre_pcpu != -1" before __pi_post_block is
> called, so maybe the above check is useless, right?

It's because a WARN is better than a double-add. And even if the caller
broke the invariant you'd have to do the cmpxchg loop above to make
things not break too much.

Paolo
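
To illustrate the point about the invariant, here is a rough userspace
sketch, not the kernel code. Everything prefixed with toy_ is made up for
the example: the struct, the per-CPU counter standing in for
blocked_vcpu_list, the vector values, and toy_warn_on() standing in for
WARN_ON_ONCE. C11 atomic_compare_exchange_weak is swapped in for the
kernel's cmpxchg, and locking and interrupt disabling are left out
entirely. It only shows why the inner helper runs the compare-and-swap loop
unconditionally and guards the list removal with a warning instead of
trusting its callers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_NR_CPUS       4
#define TOY_NOTIFY_VECTOR 0xf1u  /* arbitrary values, chosen for the sketch */
#define TOY_WAKEUP_VECTOR 0xf2u

/* Hypothetical stand-in for a vCPU; only the fields the sketch needs. */
struct toy_vcpu {
	int pre_pcpu;              /* -1: not on any blocked list */
	_Atomic uint64_t control;  /* low byte plays the role of the NV field */
};

static int toy_blocked_count[TOY_NR_CPUS]; /* stand-in for the per-CPU lists */
static int toy_warnings;

/* Report a broken invariant and return the condition, like WARN_ON does. */
static bool toy_warn_on(bool cond, const char *msg)
{
	if (cond) {
		toy_warnings++;
		fprintf(stderr, "WARN: %s\n", msg);
	}
	return cond;
}

/* Retry until the vector byte is swapped in without losing other bits. */
static void toy_set_vector(struct toy_vcpu *v, uint64_t vector)
{
	uint64_t old = atomic_load(&v->control);
	uint64_t next;

	do {
		next = (old & ~UINT64_C(0xff)) | vector;
	} while (!atomic_compare_exchange_weak(&v->control, &old, next));
}

/* Inner helper: always restore the vector, remove from the list if listed. */
static void toy_post_block_inner(struct toy_vcpu *v)
{
	toy_set_vector(v, TOY_NOTIFY_VECTOR);

	if (!toy_warn_on(v->pre_pcpu == -1, "post-block without pre-block")) {
		toy_blocked_count[v->pre_pcpu]--;
		v->pre_pcpu = -1;
	}
}

/* Warn instead of adding the same vCPU to a list twice. */
static void toy_pre_block(struct toy_vcpu *v, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);

	if (!toy_warn_on(v->pre_pcpu != -1, "vCPU already on a blocked list")) {
		v->pre_pcpu = cpu;
		toy_blocked_count[cpu]++;
	}
	toy_set_vector(v, TOY_WAKEUP_VECTOR);
}

/* Outer wrapper: nothing to undo if the vCPU was never queued. */
static void toy_post_block(struct toy_vcpu *v)
{
	if (v->pre_pcpu == -1)
		return;
	toy_post_block_inner(v);
}
```

Calling toy_post_block_inner() on a vCPU whose pre_pcpu is -1 models a
caller that broke the invariant: the vector is still put back, the counter
is left alone, and the problem is reported rather than silently corrupting
the list.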