I've been running 3.18.25-18.el6.x86_64 + our build of xen 4.4.3-9 on one host for the last couple of weeks and have gotten several soft lockups within the last 24 hours. I am posting here first in case anyone else has experienced the same issue.

Here is the first instance:

sched: RT throttling activated
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0]
Modules linked in: ebt_arp xen_pciback xen_gntalloc ebt_ip ebtable_filter ebtables ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables xt_physdev br_netfilter bridge stp llc ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 xen_acpi_processor blktap xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd joydev sg 8250_fintek serio_raw gpio_ich iTCO_wdt iTCO_vendor_support coretemp intel_powerclamp crct10dif_pclmul crc32_pclmul crc32c_intel pcspkr i2c_i801 lpc_ich igb ptp pps_core hwmon ioatdma dca i7core_edac edac_core shpchp ext3 jbd mbcache raid10 raid1 sd_mod mptsas mptscsih mptbase scsi_transport_sas aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 ahci libahci mgag200 ttm drm_kms_helper dm_mirror dm_region_hash dm_log dm_mod
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.25-18.el6.x86_64 #1
Hardware name: Supermicro X8DTN+-F/X8DTN+-F, BIOS 080016 08/03/2011
task: ffffffff81c1b4c0 ti: ffffffff81c00000 task.ti: ffffffff81c00000
RIP: e030:[<ffffffffa02811d5>]  [<ffffffffa02811d5>] xenvif_tx_build_gops+0xa5/0x890 [xen_netback]
RSP: e02b:ffff88013f403ca8  EFLAGS: 00000206
RAX: 000000000000003c RBX: ffffc90012c28000 RCX: ffffc90012c280d0
RDX: 0000000000071ea1 RSI: 0000000000000040 RDI: ffffc90012c28000
RBP: ffff88013f403e38 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000071e65
R13: ffff88013f403e50 R14: 000000000000003c R15: 0000000000000032
FS:  00007fe942ac7980(0000) GS:ffff88013f400000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff8000007f6800 CR3: 00000000bcd4c000 CR4: 0000000000002660
Stack:
 ffff88013f403d30 ffff880006de6800 ffff8800adfee1c0 ffff88013f403e54
 000000015786e75a 00000040a0351878 ffffc90012c2def0 000000015786e73c
 ffffc90012c280d0 ffffc90012c2def0 ffffffffa03516c0 ffff8800bf29e000
Call Trace:
 <IRQ>
 [<ffffffffa03516c0>] ? br_handle_frame_finish+0x3f0/0x3f0 [bridge]
 [<ffffffff815a2b9e>] ? __netif_receive_skb_core+0x1ee/0x640
 [<ffffffff815a3017>] ? __netif_receive_skb+0x27/0x70
 [<ffffffff815a326d>] ? netif_receive_skb_internal+0x2d/0x90
 [<ffffffffa01d5053>] ? igb_alloc_rx_buffers+0x63/0xe0 [igb]
 [<ffffffffa0281a0d>] xenvif_tx_action+0x4d/0xa0 [xen_netback]
 [<ffffffffa02843b5>] xenvif_poll+0x35/0x68 [xen_netback]
 [<ffffffff815a39c2>] net_rx_action+0x112/0x2a0
 [<ffffffff81076b7c>] __do_softirq+0xfc/0x2b0
 [<ffffffff81076e3d>] irq_exit+0xbd/0xd0
 [<ffffffff813b2cbc>] xen_evtchn_do_upcall+0x3c/0x50
 [<ffffffff8167659e>] xen_do_hypervisor_callback+0x1e/0x40
 <EOI>
 [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff8100a830>] ? xen_safe_halt+0x10/0x20
 [<ffffffff8101ec84>] ? default_idle+0x24/0xc0
 [<ffffffff8101e28f>] ? arch_cpu_idle+0xf/0x20
 [<ffffffff810b2276>] ? cpuidle_idle_call+0xd6/0x1d0
 [<ffffffff81091312>] ? __atomic_notifier_call_chain+0x12/0x20
 [<ffffffff810b24a5>] ? cpu_idle_loop+0x135/0x1e0
 [<ffffffff810b256b>] ? cpu_startup_entry+0x1b/0x70
 [<ffffffff810b25b0>] ? cpu_startup_entry+0x60/0x70
 [<ffffffff81667c57>] ? rest_init+0x77/0x80
 [<ffffffff81d774c9>] ? start_kernel+0x441/0x448
 [<ffffffff81d76ea6>] ? set_init_arg+0x5d/0x5d
 [<ffffffff81d76603>] ? x86_64_start_reservations+0x2a/0x2c
 [<ffffffff81d7aae1>] ? xen_start_kernel+0x5ef/0x5f1
Code: 00 0f 87 06 07 00 00 44 8b b3 b8 00 00 00 44 03 b3 c0 00 00 00 45 29 e6 41 39 c6 44 0f 47 f0 45 85 f6 0f 84 8f 00 00 00 0f ae e8 <8b> 83 c0 00 00 00 83 e8 01 44 21 e0 48 8d 04 40 48 c1 e0 02 48

The remaining lockups share the backtrace below, except that in two of them RIP was in net_rx_action:

 [<ffffffff815a39c2>] net_rx_action+0x112/0x2a0
 [<ffffffff81076b7c>] __do_softirq+0xfc/0x2b0
 [<ffffffff81076e3d>] irq_exit+0xbd/0xd0
 [<ffffffff813b2cbc>] xen_evtchn_do_upcall+0x3c/0x50
 [<ffffffff8167659e>] xen_do_hypervisor_callback+0x1e/0x40
 <EOI>
 [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff8100a830>] ? xen_safe_halt+0x10/0x20
 [<ffffffff8101ec84>] ? default_idle+0x24/0xc0
 [<ffffffff8101e28f>] ? arch_cpu_idle+0xf/0x20
 [<ffffffff810b2276>] ? cpuidle_idle_call+0xd6/0x1d0
 [<ffffffff81091312>] ? __atomic_notifier_call_chain+0x12/0x20
 [<ffffffff810b24a5>] ? cpu_idle_loop+0x135/0x1e0
 [<ffffffff810b256b>] ? cpu_startup_entry+0x1b/0x70
 [<ffffffff810b25b0>] ? cpu_startup_entry+0x60/0x70
 [<ffffffff81667c57>] ? rest_init+0x77/0x80
 [<ffffffff81d774c9>] ? start_kernel+0x441/0x448
 [<ffffffff81d76ea6>] ? set_init_arg+0x5d/0x5d
 [<ffffffff81d76603>] ? x86_64_start_reservations+0x2a/0x2c
 [<ffffffff81d7aae1>] ? xen_start_kernel+0x5ef/0x5f1

I can post more complete backtraces if that information would be useful to someone.
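
For anyone comparing several such reports, the function RIP was in can be tallied mechanically. Below is a minimal Python sketch, assuming the reports have been collected as text in the format shown above. The name tally_rip_symbols and the regular expression are illustrative assumptions, not part of any existing tool, and the pattern has only been matched against the RIP line quoted in this post.

```python
import re
from collections import Counter

# Matches the symbol on a "RIP:" line as printed in the report above, e.g.
#   RIP: e030:[<ffffffffa02811d5>]  [<ffffffffa02811d5>] xenvif_tx_build_gops+0xa5/0x890 [xen_netback]
# The non-greedy ".*?" stops at the first "]" that is followed by "symbol+0x...".
RIP_RE = re.compile(r"RIP:.*?\]\s+(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def tally_rip_symbols(log_text):
    """Count how often each function appears as RIP in the given log text."""
    counts = Counter()
    for match in RIP_RE.finditer(log_text):
        counts[match.group("symbol")] += 1
    return counts


if __name__ == "__main__":
    # The RIP line from the first lockup reported above.
    sample = (
        "RIP: e030:[<ffffffffa02811d5>]  [<ffffffffa02811d5>] "
        "xenvif_tx_build_gops+0xa5/0x890 [xen_netback]\n"
    )
    print(tally_rip_symbols(sample).most_common())
    # prints [('xenvif_tx_build_gops', 1)]
```

Feeding it the concatenated lockup reports (for example the relevant part of /var/log/messages, if that is where they end up on the host) would show at a glance how many lockups had RIP in xenvif_tx_build_gops versus net_rx_action.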