Hi Paolo,

We have a reproducer now. It reports that the *blocked_vcpu_on_cpu* list is
corrupted and that an entry is added twice. Do you have any suggestions?

[231298.241923] WARNING: at lib/list_debug.c:36 __list_add+0x8a/0xc0()
[231298.241925] list_add double add: new=ffff881b8bc48050, prev=ffff881b8bc48050, next=ffff881fffa576f0.
[231298.241926] Modules linked in: guest_kbox_ram(O) igb(OVE) mlx4_ib(OVE) ib_sa(OVE) ib_mad(OVE) mlx4_en(OVE) mlx4_core(OVE) ib_uverbs(OVE) vhost_scsi(OE) target_core_pscsi target_core_file target_core_iblock target_core_mod dm_mod kbox_pci(OVE) ib_core(OVE) ib_addr(OVE) ib_netlink(OVE) compat(OVE) ixgbe(O) ext3 mbcache jbd signo_catch(O) bum(O) ip_set nfnetlink prio(O) nat(O) vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre libcrc32c kbox(O) pmcint(O) vxlan ip6_udp_tunnel udp_tunnel sd_mod crc_t10dif crct10dif_generic sg ipmi_devintf kvm_intel(O) kvm(O) coretemp crct10dif_pclmul crct10dif_common ahci libahci mpt2sas i2c_i801 i2c_algo_bit libata dca i2c_core raid_class ptp scsi_transport_sas pps_core ipmi_si ipmi_msghandler nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vhost_net(O) tun(O) vhost(O) macvtap
[231298.241986] macvlan vfio_pci irqbypass vfio_iommu_type1 vfio ip_tables [last unloaded: guest_kbox_ram]
[231298.241994] CPU: 1 PID: 12431 Comm: CPU 0/KVM Tainted: G W OE ----V------- 3.10.0-327.49.58.52_13.x86_64 #1
[231298.241996] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. CH80GPUB8/CH80GPUB8, BIOS GPUBV201 06/18/2015
[231298.241997] ffff881fa372fc60 00000000054b553c ffff881fa372fc18 ffffffff81644aaf
[231298.242002] ffff881fa372fc50 ffffffff8107b1c0 ffff881b8bc48050 ffff881fffa576f0
[231298.242006] ffff881b8bc48050 000000000000a022 000000000000001b ffff881fa372fcb8
[231298.242011] ffffffff8107b25c ffffffff818a9ce8 ffff881b00000030 ffff881fa372fcc8
[231298.242015] ffff881fa372fc88 00000000054b553c 0000000000000001 ffffffff8107b205
[231298.242020] ffffffff81a38960 ffff881b8bc48050 ffff881b8bc48050 ffff881fffa576f0
[231298.242025] ffff881fa372fce0 ffffffff8131a41a ffff881b8bc48000 00000000000176e0
[231298.242029] 0000000000000292 ffff881fa372fdd0 ffffffffa10de6d0 ffff881b8bc48050
[231298.242036] ffff881fa372ffd8 ffff881bb4c70000 ffff88176dfd8048 0000000000000001
[231298.242043] ffff881fa372fe18 ffffffff81656a31 ffffffffa10d9360 ffffffffa10fc140
[231298.242048] ffff881c23580100 0000000000000000 0000000000000000 ffff881b8bc48000
[231298.242052] ffff881fa372fd88 ffffffffa10dad65 0000000000000000 ffffffffa10d9360
[231298.242057] 00000000054b553c ffff881c23580200 0000000000000000 ffff881c23580000
[231298.242061] ffff881b8bc48000 ffff881fa372fdb8 ffff881b8bc48000 ffff881fa372ffd8
[231298.242066] ffff881bb4c70000 ffff88176dfd8048 0000000000000001 ffff881fa372fe18
[231298.242070] ffffffffa05ed1e8 ffffffee7ffbfaff 00000000054b553c ffff881b8bc48000
[231298.242075] ffff883fb857b600 0000000000000000 ffff881ae737dc00 ffff881bb4c70000
[231298.242079] ffff881fa372feb0 ffffffffa05d4b31 0000000000000000 0000000000008000
[231298.242084] ffff881fa372fe70 ffffffff8112f643 000000000000ffff ffff881ae737dc38
[231298.242088] ffffffffa05d4880 0000000000000000 0000000000000000 000000000000ae80
[231298.242093] ffff881ae737dc00 00000000054b553c ffff881ae737dc00 ffff883fd2a0a500
[231298.242097] 0000000000000000 0000000000000000 0000000000000001 ffff881fa372ff28
[231298.242102] ffffffff811fd9d5 000000000000ffff ffff881ae737dc38 0000000000000000
[231298.242106] 0000000000000000 000000000000ae80 0000000000000018 ffff881ae737dc00
[231298.242111] Call Trace:
[231298.242115] [<ffffffff81644aaf>] dump_stack+0x19/0x1b
[231298.242118] [<ffffffff8107b1c0>] warn_slowpath_common+0x70/0xb0
[231298.242122] [<ffffffff8107b25c>] warn_slowpath_fmt+0x5c/0x80
[231298.242126] [<ffffffff8107b205>] ? warn_slowpath_fmt+0x5/0x80
[231298.242130] [<ffffffff8131a41a>] __list_add+0x8a/0xc0
[231298.242136] [<ffffffffa10de6d0>] vmx_pre_block+0xe0/0x220 [kvm_intel]
[231298.242140] [<ffffffff81656a31>] ? ftrace_call+0x5/0x2f
[231298.242145] [<ffffffffa10d9360>] ? vmx_invpcid_supported+0x20/0x20 [kvm_intel]
[231298.242151] [<ffffffffa10dad65>] ? vmx_sync_pir_to_irr+0x5/0x30 [kvm_intel]
[231298.242156] [<ffffffffa10d9360>] ? vmx_invpcid_supported+0x20/0x20 [kvm_intel]
[231298.242167] [<ffffffffa05ed1e8>] kvm_arch_vcpu_ioctl_run+0x178/0x440 [kvm]
[231298.242176] [<ffffffffa05d4b31>] kvm_vcpu_ioctl+0x2b1/0x640 [kvm]
[231298.242180] [<ffffffff8112f643>] ? ftrace_ops_list_func+0x83/0x110
[231298.242189] [<ffffffffa05d4880>] ? vcpu_put+0x30/0x30 [kvm]
[231298.242193] [<ffffffff811fd9d5>] do_vfs_ioctl+0x2e5/0x4c0
[231298.242197] [<ffffffff811fdc51>] SyS_ioctl+0xa1/0xc0
[231298.242201] [<ffffffff81654e09>] system_call_fastpath+0x16/0x1b
[231298.245626] WARNING: at lib/list_debug.c:33 __list_add+0xac/0xc0()
[231298.245628] list_add corruption. prev->next should be next (ffff881fffa576f0), but was dead000000100100. (prev=ffff881b8bc48050).
[231298.245629] Modules linked in: guest_kbox_ram(O) igb(OVE) mlx4_ib(OVE) ib_sa(OVE) ib_mad(OVE) mlx4_en(OVE) mlx4_core(OVE) ib_uverbs(OVE) vhost_scsi(OE) target_core_pscsi target_core_file target_core_iblock target_core_mod dm_mod kbox_pci(OVE) ib_core(OVE) ib_addr(OVE) ib_netlink(OVE) compat(OVE) ixgbe(O) ext3 mbcache jbd signo_catch(O) bum(O) ip_set nfnetlink prio(O) nat(O) vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre libcrc32c kbox(O) pmcint(O) vxlan ip6_udp_tunnel udp_tunnel sd_mod crc_t10dif crct10dif_generic sg ipmi_devintf kvm_intel(O) kvm(O) coretemp crct10dif_pclmul crct10dif_common ahci libahci mpt2sas i2c_i801 i2c_algo_bit libata dca i2c_core raid_class ptp scsi_transport_sas pps_core ipmi_si ipmi_msghandler nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vhost_net(O) tun(O) vhost(O) macvtap
[231298.245711] macvlan vfio_pci irqbypass vfio_iommu_type1 vfio ip_tables [last unloaded: guest_kbox_ram]
[231298.245725] CPU: 1 PID: 12431 Comm: CPU 0/KVM Tainted: G W OE ----V------- 3.10.0-327.49.58.52_13.x86_64 #1
[231298.245729] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. CH80GPUB8/CH80GPUB8, BIOS GPUBV201 06/18/2015
[231298.245732] ffff881fa372fc60 00000000054b553c ffff881fa372fc18 ffffffff81644aaf
[231298.245740] ffff881fa372fc50 ffffffff8107b1c0 ffff881b8bc48050 ffff881fffa576f0
[231298.245748] ffff881b8bc48050 000000000000a022 000000000000001b ffff881fa372fcb8
[231298.245756] ffffffff8107b25c ffffffff818a9c98 ffff881f00000030 ffff881fa372fcc8
[231298.245765] ffff881fa372fc88 00000000054b553c 0000000000000001 ffffffff8107b205
[231298.245773] ffffffff81a38960 ffff881fffa576f0 dead000000100100 ffff881b8bc48050
[231298.245781] ffff881fa372fce0 ffffffff8131a43c ffff881b8bc48000 00000000000176e0
[231298.245791] 0000000000000292 ffff881fa372fdd0 ffffffffa10de6d0 ffff881b8bc48050
[231298.245799] ffff881fa372ffd8 ffff881bb4c70000 ffff88176dfd8048 0000000000000001
[231298.245808] ffff881fa372fe18 ffffffff81656a31 ffffffffa10d9360 ffffffffa10fc140
[231298.245816] ffff881c23580100 0000000000000000 0000000000000000 ffff881b8bc48000
[231298.245826] ffff881fa372fd88 ffffffffa10dad65 0000000000000000 ffffffffa10d9360
[231298.245834] 00000000054b553c ffff881c23580200 0000000000000000 ffff881c23580000
[231298.245842] ffff881b8bc48000 ffff881fa372fdb8 ffff881b8bc48000 ffff881fa372ffd8
[231298.245847] ffff881bb4c70000 ffff88176dfd8048 0000000000000001 ffff881fa372fe18
[231298.245851] ffffffffa05ed1e8 ffffffee7ffbfaff 00000000054b553c ffff881b8bc48000
[231298.245856] ffff883fb857b600 0000000000000000 ffff881ae737dc00 ffff881bb4c70000
[231298.245861] ffff881fa372feb0 ffffffffa05d4b31 0000000000000000 0000000000008000
[231298.245866] ffff881fa372fe70 ffffffff8112f643 000000000000ffff ffff881ae737dc38
[231298.245870] ffffffffa05d4880 0000000000000000 0000000000000000 000000000000ae80
[231298.245875] ffff881ae737dc00 00000000054b553c ffff881ae737dc00 ffff883fd2a0a500
[231298.245879] 0000000000000000 0000000000000000 0000000000000001 ffff881fa372ff28
[231298.245883] ffffffff811fd9d5 000000000000ffff ffff881ae737dc38 0000000000000000
[231298.245888] 0000000000000000 000000000000ae80 0000000000000018 ffff881ae737dc00
[231298.245893] Call Trace:
[231298.245898] [<ffffffff81644aaf>] dump_stack+0x19/0x1b
[231298.245902] [<ffffffff8107b1c0>] warn_slowpath_common+0x70/0xb0
[231298.245906] [<ffffffff8107b25c>] warn_slowpath_fmt+0x5c/0x80
[231298.245910] [<ffffffff8107b205>] ? warn_slowpath_fmt+0x5/0x80
[231298.245913] [<ffffffff8131a43c>] __list_add+0xac/0xc0
[231298.245920] [<ffffffffa10de6d0>] vmx_pre_block+0xe0/0x220 [kvm_intel]
[231298.245924] [<ffffffff81656a31>] ? ftrace_call+0x5/0x2f
[231298.245930] [<ffffffffa10d9360>] ? vmx_invpcid_supported+0x20/0x20 [kvm_intel]
[231298.245936] [<ffffffffa10dad65>] ? vmx_sync_pir_to_irr+0x5/0x30 [kvm_intel]
[231298.245941] [<ffffffffa10d9360>] ? vmx_invpcid_supported+0x20/0x20 [kvm_intel]
[231298.245953] [<ffffffffa05ed1e8>] kvm_arch_vcpu_ioctl_run+0x178/0x440 [kvm]
[231298.245962] [<ffffffffa05d4b31>] kvm_vcpu_ioctl+0x2b1/0x640 [kvm]
[231298.245967] [<ffffffff8112f643>] ? ftrace_ops_list_func+0x83/0x110
[231298.245976] [<ffffffffa05d4880>] ? vcpu_put+0x30/0x30 [kvm]
[231298.245980] [<ffffffff811fd9d5>] do_vfs_ioctl+0x2e5/0x4c0
[231298.245985] [<ffffffff811fdc51>] SyS_ioctl+0xa1/0xc0
[231298.245989] [<ffffffff81654e09>] system_call_fastpath+0x16/0x1b

On 2017/5/26 18:40, Paolo Bonzini wrote:
>
>
> On 24/05/2017 07:04, Longpeng (Mike) wrote:
>>>> it crashed at *1ec1*, and %rax gets a wrong value (0xdead000000100100) at *1e92*.
>>>> It seems the *blocked_vcpu_on_cpu* list is corrupted, but KVM only accesses this
>>>> list in pre_block/post_block/wakeup_handler, and these three functions look correct.
>>>>
>>>> The KVM version is 4.4-stable.
>>>>
>>>> Do you have any ideas? Any suggestion would be greatly appreciated, thanks!
>>>>
>>> Is this only seen with posted interrupt support enabled? Booting with
>>> intremap=nopost on the kernel command line would disable it. Thanks,
>>
>> We tested with PI support enabled, but we are not sure yet whether it only
>> occurs with PI enabled.
>
> This code should not run at all with PI disabled, since the handler is
> only reachable through an IRTE.
>
> As you said, the list manipulation in those functions is fairly simple.
> If you have a reproducer, you can try running it with CONFIG_DEBUG_LIST
> and see what you get.
>
> Thanks,
>
> Paolo
>
> .
>

-- 
Regards,
Longpeng(Mike)
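
For reference when reading the two warnings together, here is a minimal userspace sketch (plain C, not kernel code) of one sequence that produces a double-add warning followed by a prev->next corruption warning with the same poison value. It assumes the debug checks in __list_add only warn and then link the entry anyway, and that deletion poisons the entry's next pointer with 0xdead000000100100 as the log shows. The names checked_add_tail, del_entry and replay_sequence are hypothetical, and the sketch says nothing about which KVM path actually performs the extra add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
    struct list_head *next, *prev;
};

/* POISON1 is the value the second warning reports in prev->next.
 * POISON2 is assumed by analogy. Both assume 64-bit pointers. */
#define POISON1 ((struct list_head *)(uintptr_t)0xdead000000100100ULL)
#define POISON2 ((struct list_head *)(uintptr_t)0xdead000000200200ULL)

#define WARN_CORRUPTION 1 /* "list_add corruption" */
#define WARN_DOUBLE_ADD 2 /* "list_add double add" */

/* Hypothetical helper: run the two checks seen in the log, then link the
 * entry at the tail regardless (assumed warn-only behaviour). */
static int checked_add_tail(struct list_head *entry, struct list_head *head)
{
    assert(entry != NULL && head != NULL);

    struct list_head *prev = head->prev, *next = head;
    int warnings = 0;

    if (prev->next != next)
        warnings |= WARN_CORRUPTION;
    if (entry == prev || entry == next)
        warnings |= WARN_DOUBLE_ADD;

    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
    return warnings;
}

/* Hypothetical helper: unlink the entry and poison its pointers. */
static void del_entry(struct list_head *entry)
{
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = POISON1;
    entry->prev = POISON2;
}

/* Replays: add, add again (no delete in between), delete, add.
 * Stores the warning mask of each of the three adds. */
void replay_sequence(int warnings[3])
{
    struct list_head head = { &head, &head };
    struct list_head vcpu = { NULL, NULL };

    warnings[0] = checked_add_tail(&vcpu, &head); /* clean add */
    warnings[1] = checked_add_tail(&vcpu, &head); /* entry becomes self-linked */
    del_entry(&vcpu);                             /* head still points at entry */
    warnings[2] = checked_add_tail(&vcpu, &head); /* prev->next is now POISON1 */
}
```

Calling replay_sequence(w) leaves w[0] at 0, w[1] at WARN_DOUBLE_ADD, and w[2] with WARN_CORRUPTION set. In the sketch the last add also trips the double-add check; the log excerpt ends before it would show whether that happened on the real system.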