Re: [PATCH 0/2] KVM: VMX: fix race condition in VT-d PI

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 18/09/2017 03:56, Haozhong Zhang wrote:
> In one of our recent test which transferring data via a passthrough
> NIC with VT-d PI enabled, we encountered the following call trace on
> KVM commit 5f54c8b2d4fa:
> 
> [30521.205623] WARNING: CPU: 30 PID: 5713 at arch/x86/kvm/vmx.c:5094 vmx_deliver_posted_interrupt+0xda/0xf0 [kvm_intel]
> [30521.353638] CPU: 30 PID: 5713 Comm: CPU 10/KVM Not tainted 4.13.0-rc6+ #1
> [30521.372155] task: ffff92f3a689c000 task.stack: ffffbd3fb1f9c000
> [30521.378757] RIP: 0010:vmx_deliver_posted_interrupt+0xda/0xf0 [kvm_intel]
> [30521.386228] RSP: 0018:ffffbd3fb1f9f9c8 EFLAGS: 00010202
> [30521.392055] RAX: 00006a0000f20002 RBX: ffff92f30fcf8000 RCX: 000000000000001d
> [30521.400010] RDX: 0000000000000070 RSI: 00000000000000fd RDI: ffff92f30fcf8000
> [30521.407964] RBP: ffffbd3fb1f9f9d0 R08: 0000000000000000 R09: 0000000000000000
> [30521.415919] R10: 0000000000000000 R11: 0000000000000008 R12: ffffbd3fb1f9fab4
> [30521.423878] R13: ffff92f323d6c258 R14: ffff92f30fcf8000 R15: ffff92f321f80900
> [30521.431834] FS:  00007f1834abd700(0000) GS:ffff92f3c8f80000(0000) knlGS:0000000000000000
> [30521.440854] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [30521.447261] CR2: 00007fd5e9f20710 CR3: 0000017727f07000 CR4: 00000000007426e0
> [30521.455217] PKRU: 55555554
> [30521.458237] Call Trace:
> [30521.461104]  __apic_accept_irq+0x26e/0x320 [kvm]
> [30521.466263]  kvm_irq_delivery_to_apic_fast+0x108/0x3e0 [kvm]
> [30521.472589]  ? smp_call_function_single+0xbf/0xf0
> [30521.477846]  kvm_irq_delivery_to_apic+0x63/0x2b0 [kvm]
> [30521.483577]  ? __update_load_avg_se.isra.28+0x142/0x160
> [30521.489397]  ? __update_load_avg_se.isra.28+0x142/0x160
> [30521.495224]  kvm_lapic_reg_write+0x11f/0x650 [kvm]
> [30521.500576]  kvm_x2apic_msr_write+0x54/0x80 [kvm]
> [30521.505833]  kvm_set_msr_common+0x3ce/0xc50 [kvm]
> [30521.511080]  ? __pi_post_block+0x13d/0x1c0 [kvm_intel]
> [30521.516802]  vmx_set_msr+0xb5/0x860 [kvm_intel]
> [30521.521860]  kvm_set_msr+0x71/0x100 [kvm]
> [30521.526334]  handle_wrmsr+0x58/0x150 [kvm_intel]
> [30521.531485]  vmx_handle_exit+0xa4/0x1490 [kvm_intel]
> [30521.537024]  ? vmx_vcpu_run+0x31b/0x4d0 [kvm_intel]
> [30521.542474]  kvm_arch_vcpu_ioctl_run+0xa38/0x1610 [kvm]
> [30521.548299]  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
> [30521.553652]  kvm_vcpu_ioctl+0x33a/0x600 [kvm]
> [30521.558511]  ? kvm_vcpu_ioctl+0x33a/0x600 [kvm]
> [30521.563568]  ? eventfd_read+0x4b/0x90
> [30521.567657]  do_vfs_ioctl+0xa1/0x5d0
> [30521.571641]  ? SyS_futex+0x7f/0x180
> [30521.575529]  SyS_ioctl+0x79/0x90
> [30521.579134]  entry_SYSCALL_64_fastpath+0x1a/0xa5
> [30521.584278] RIP: 0033:0x7f1849347787
> [30521.588261] RSP: 002b:00007f1834aba5e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [30521.596695] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1849347787
> [30521.604643] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000001e
> [30521.612594] RBP: 0000000000000000 R08: 00005637a53936d0 R09: 00000000ffffffff
> [30521.620542] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> [30521.628491] R13: 0000000000000000 R14: 00007f186701a000 R15: 00005637a88fc000
> [30521.636441] Code: c2 c1 e8 06 83 e2 3f 48 c1 e0 03 48 c1 e2 0a 48 8d ba e0 1c a1 88 48 29 c7 48 8b 05 01 29 e9 c3 ff 90 a8 00 00 00 e9 59 ff ff ff <0f> ff eb c8 0f ff eb 88 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00
> 
> 
> After investigation, we found warnings in kvm_vcpu_trigger_posted_interrupt()
> (called in vmx_deliver_posted_interrupt())
> 
>         if (vcpu->mode == IN_GUEST_MODE) {
>                 ...
>                  * If the vcpu is in guest mode, it means it is
>                  * running instead of being scheduled out and
>                  * waiting in the run queue, and that's the only
>                  * case when 'SN' is set currently, warning if
>                  * 'SN' is set.
>                  */
>                 WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc));
>                 ...
>         }
> 
> as well as in pi_pre_block()
> 
>         do {
>                 WARN((pi_desc->sn == 1),
>                      "Warning: SN field of posted-interrupts "
>                      "is set before blocking\n");
>                      
>                 ...
>         } while (!pi_try_cmpxchg_control(pi_desc, &old, &new));
> 
> assumes that if a vCPU is in guest mode, the SN field of its VT-d PI
> descriptor should not be set. However, vmx_update_pi_irte() changes
> the SN field without checking the mode of the corresponding vCPU,
> which may break the above assumption and trigger the above call trace.
> 
> Patch 1 removes the unnecessary changes of the SN field in
> vmx_update_pi_irte() to avoid the race condition.
> 
> Even though patch 1 removes the raced change, WARN_ON_ONCE() in
> vmx_deliver_posted_interrupt() is still too strong and can be
> triggered in allowed cases. Patch 2 documents the those cases and
> removes that WARN_ON_ONCE().
> 
> 
> Haozhong Zhang (2):
>   KVM: VMX: do not change SN bit in vmx_update_pi_irte()
>   KVM: VMX: remove WARN_ON_ONCE in kvm_vcpu_trigger_posted_interrupt
> 
>  arch/x86/kvm/vmx.c | 39 ++++++++++++++++++++++-----------------
>  1 file changed, 22 insertions(+), 17 deletions(-)
> 


Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux