RFH: kvm-0.14.1 on 2.6.32.42 stuck in flush_tlb_others_ipi()

Hello,

one of our VMs regularly gets stuck: the VM is completely unresponsive (no ssh, 
no serial console, no VNC). Using "gdbserver" and a remote system to debug 
the running VM, I see 3 CPUs (1,3,4) stuck in
 pgd_alloc() → spin_lock_irqsave(pgd_lock)
while the 4th CPU (2) is waiting in
 pgd_alloc() → pgd_prepopulate_pmd() → ... → flush_tlb_others_ipi()

195                     while (!cpumask_empty(to_cpumask(f->flush_cpumask)))
196                             cpu_relax();
(gdb) print f->flush_cpumask
$5 = {1}
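
To read the mask value printed above, here is a minimal sketch in plain user-space C (not kernel code). It assumes the usual bitmap layout, in which bit b of word w stands for CPU w * BITS_PER_WORD + b. The helper name pending_cpus() and the BITS_PER_WORD macro are made up for this illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Bits in one mask word; assumes the mask is an array of unsigned long. */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: store the CPU numbers whose bits are set in
 * `words` into `cpus` (at most `max` entries) and return how many
 * entries were stored. Bit b of word w maps to CPU w * BITS_PER_WORD + b.
 */
static size_t pending_cpus(const unsigned long *words, size_t nwords,
                           unsigned int *cpus, size_t max)
{
    size_t found = 0;

    for (size_t w = 0; w < nwords; w++) {
        for (size_t b = 0; b < BITS_PER_WORD; b++) {
            if (found == max)
                return found;   /* output buffer is full */
            if ((words[w] >> b) & 1UL)
                cpus[found++] = (unsigned int)(w * BITS_PER_WORD + b);
        }
    }
    return found;
}
```

Called with the single word {1} from the gdb output, this stores one entry, CPU 0. If gdb's thread numbers are offset by one from the kernel's CPU numbers (an assumption), that would be the CPU shown as CPU#0 in the attached log.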

CPU 1 is doing a do_execve() syscall, while CPUs 2-4 are doing a do_fork() 
syscall, according to "thread apply all backtrace".

After a "set variable f->flush_cpumask 0" from gdb the kernel continued 
and dumped the trace information, which I have attached.


Host: Debian linux-2.6.32-38-amd64 (=2.6.32.42), 8 Cores
Kvm: 0.14.1+dfsg
Guest: Debian linux-2.6.32-38-i686-bigmem, 4 CPUs

Is this a known bug and/or is a fix available?
I can gather more information from the VM if needed.

Sincerely
Philipp
-- 
Philipp Hahn           Open Source Software Engineer      hahn@xxxxxxxxxxxxx
Univention GmbH        Linux for Your Business        fon: +49 421 22 232- 0
Mary-Somerville-Str.1  D-28359 Bremen                 fax: +49 421 22 232-99
                                                   http://www.univention.de/
[261850.404057] BUG: soft lockup - CPU#1 stuck for 219010s! [sshd:2707]
[261850.404057] Modules linked in: binfmt_misc nfsd exportfs xt_tcpudp nf_conntrack_ipv6 ip6table_mangle ip6table_filter ip6_tables xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables nfs lockd fscache nfs_acl auth_rpcgss sunrpc quota_v2 quota_tree sd_mod crc_t10dif ide_generic ide_gd_mod ide_core cirrusfb 8139cp mii snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 psmouse i2c_core i6300esb virtio_balloon joydev serio_raw pcspkr processor evdev usbhid hid ext3 jbd dm_snapshot dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ata_generic virtio_blk virtio_net ehci_hcd ata_piix thermal usbcore nls_base virtio_pci floppy thermal_sys libata button [last unloaded: scsi_wait_scan]
[261850.404057]
[261850.404057] Pid: 2707, comm: sshd Not tainted (2.6.32-ucs52-686-bigmem #1) Bochs
[261850.404057] EIP: 0060:[<c1293e1d>] EFLAGS: 00000292 CPU: 1
[261850.404057] EIP is at _spin_unlock_irqrestore+0x9/0xf
[261850.404057] EAX: 00000292 EBX: 00000003 ECX: 00000800 EDX: 00000292
[261850.404057] ESI: 00000000 EDI: 00000001 EBP: f55f3eb4 ESP: f55f3e64
[261850.404057]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[261850.404057] CR0: 8005003b CR2: b72a1000 CR3: 35607000 CR4: 000006f0
[261850.404057] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[261850.404057] DR6: ffff0ff0 DR7: 00000400
[261850.404057] Call Trace:
[261850.404057]  [<c102422e>] ? pgd_alloc+0x1c7/0x21a
[261850.404057]  [<c1034908>] mm_init+0xa8/0xd4
[261850.404057]  [<c1034dd1>] ? dup_mm+0x6a/0x389
[261850.404057]  [<c10356bf>] ? copy_process+0x57a/0xf28
[261850.404057]  [<c1035a60>] ? copy_process+0x91b/0xf28
[261850.404057]  [<c10361a7>] ? do_fork+0x13a/0x2bc
[261850.404057]  [<c1006e6e>] ? sys_clone+0x21/0x27
[261850.404057]  [<c100829c>] ? syscall_call+0x7/0xb
[261850.404058] BUG: soft lockup - CPU#2 stuck for 218991s! [runsv:1236]
[261850.404058] Modules linked in: binfmt_misc nfsd exportfs xt_tcpudp nf_conntrack_ipv6 ip6table_mangle ip6table_filter ip6_tables xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables nfs lockd fscache nfs_acl auth_rpcgss sunrpc quota_v2 quota_tree sd_mod crc_t10dif ide_generic ide_gd_mod ide_core cirrusfb 8139cp mii snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 psmouse i2c_core i6300esb virtio_balloon joydev serio_raw pcspkr processor evdev usbhid hid ext3 jbd dm_snapshot dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ata_generic virtio_blk virtio_net ehci_hcd ata_piix thermal usbcore nls_base virtio_pci floppy thermal_sys libata button [last unloaded: scsi_wait_scan]
[261850.404058]
[261850.404058] Pid: 1236, comm: runsv Not tainted (2.6.32-ucs52-686-bigmem #1) Bochs
[261850.404058] EIP: 0060:[<c1293e1d>] EFLAGS: 00000296 CPU: 2
[261850.404058] EIP is at _spin_unlock_irqrestore+0x9/0xf
[261850.404058] EAX: 00000296 EBX: 00000003 ECX: 00000003 EDX: 00000296
[261850.404058] ESI: 00000000 EDI: 00000001 EBP: f602febc ESP: f602fe6c
[261850.404058]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[261850.404058] CR0: 8005003b CR2: 0810acf4 CR3: 36a91000 CR4: 000006f0
[261850.404058] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[261850.404058] DR6: ffff0ff0 DR7: 00000400
[261850.404058] Call Trace:
[261850.404058]  [<c102422e>] ? pgd_alloc+0x1c7/0x21a
[261850.404058]  [<c1034908>] mm_init+0xa8/0xd4
[261850.404058]  [<c1034dd1>] ? dup_mm+0x6a/0x389
[261850.404058]  [<c10356bf>] ? copy_process+0x57a/0xf28
[261850.404058]  [<c1035a60>] ? copy_process+0x91b/0xf28
[261850.404058]  [<c10361a7>] ? do_fork+0x13a/0x2bc
[261850.404058]  [<c10bae59>] ? do_sync_read+0x0/0x107
[261850.404058]  [<c10bb8ae>] ? vfs_read+0x7b/0xd3
[261850.404058]  [<c100dc12>] ? sys_fork+0x15/0x19
[261850.404058]  [<c100829c>] ? syscall_call+0x7/0xb
[261850.438247] BUG: soft lockup - CPU#0 stuck for 218999s! [pbuilder-update:2713]
[261850.438247] Modules linked in: binfmt_misc nfsd exportfs xt_tcpudp nf_conntrack_ipv6 ip6table_mangle ip6table_filter ip6_tables xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables nfs lockd fscache nfs_acl auth_rpcgss sunrpc quota_v2 quota_tree sd_mod crc_t10dif ide_generic ide_gd_mod ide_core cirrusfb 8139cp mii snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 psmouse i2c_core i6300esb virtio_balloon joydev serio_raw pcspkr processor evdev usbhid hid ext3 jbd dm_snapshot dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ata_generic virtio_blk virtio_net ehci_hcd ata_piix thermal usbcore nls_base virtio_pci floppy thermal_sys libata button [last unloaded: scsi_wait_scan]
[261850.438247]
[261850.438247] Pid: 2713, comm: pbuilder-update Not tainted (2.6.32-ucs52-686-bigmem #1) Bochs
[261850.438247] EIP: 0060:[<c1293e1d>] EFLAGS: 00200292 CPU: 0
[261850.438247] EIP is at _spin_unlock_irqrestore+0x9/0xf
[261850.438247] EAX: 00200292 EBX: 00000003 ECX: 00000000 EDX: 00200292
[261850.438247] ESI: 00000000 EDI: 00000001 EBP: f54ddf4c ESP: f54ddefc
[261850.438247]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[261850.438247] CR0: 8005003b CR2: b73611a0 CR3: 35683000 CR4: 000006f0
[261850.438247] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[261850.438247] DR6: ffff0ff0 DR7: 00000400
[261850.438247] Call Trace:
[261850.438247]  [<c102422e>] ? pgd_alloc+0x1c7/0x21a
[261850.438247]  [<c1292e1f>] ? _cond_resched+0x25/0x3c
[261850.438247]  [<c1292e56>] ? wait_for_common+0x20/0x100
[261850.438247]  [<c1034908>] mm_init+0xa8/0xd4
[261850.438247]  [<c10bfb14>] ? bprm_mm_init+0x14/0x163
[261850.438247]  [<c10bfffc>] ? do_execve+0xe3/0x25b
[261850.438247]  [<c1006e29>] ? sys_execve+0x23/0x47
[261850.438247]  [<c100829c>] ? syscall_call+0x7/0xb
[261850.483139] BUG: soft lockup - CPU#3 stuck for 218991s! [nrpe:1254]
[261850.483139] Modules linked in: binfmt_misc nfsd exportfs xt_tcpudp nf_conntrack_ipv6 ip6table_mangle ip6table_filter ip6_tables xt_state iptable_mangle iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables nfs lockd fscache nfs_acl auth_rpcgss sunrpc quota_v2 quota_tree sd_mod crc_t10dif ide_generic ide_gd_mod ide_core cirrusfb 8139cp mii snd_pcm snd_timer snd soundcore snd_page_alloc i2c_piix4 psmouse i2c_core i6300esb virtio_balloon joydev serio_raw pcspkr processor evdev usbhid hid ext3 jbd dm_snapshot dm_mirror dm_region_hash dm_log dm_mod uhci_hcd ata_generic virtio_blk virtio_net ehci_hcd ata_piix thermal usbcore nls_base virtio_pci floppy thermal_sys libata button [last unloaded: scsi_wait_scan]
[261850.483139]
[261850.483139] Pid: 1254, comm: nrpe Not tainted (2.6.32-ucs52-686-bigmem #1) Bochs
[261850.483139] EIP: 0060:[<c1293e1d>] EFLAGS: 00200292 CPU: 3
[261850.483139] EIP is at _spin_unlock_irqrestore+0x9/0xf
[261850.483139] EAX: 00200292 EBX: 00000003 ECX: 00000004 EDX: 00200292
[261850.483139] ESI: 00000000 EDI: 00000001 EBP: f647feb4 ESP: f647fe64
[261850.483139]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[261850.483139] CR0: 8005003b CR2: b74ee580 CR3: 36010000 CR4: 000006f0
[261850.483139] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[261850.483139] DR6: ffff0ff0 DR7: 00000400
[261850.483139] Call Trace:
[261850.483139]  [<c102422e>] ? pgd_alloc+0x1c7/0x21a
[261850.483139]  [<c1034908>] mm_init+0xa8/0xd4
[261850.483139]  [<c1034dd1>] ? dup_mm+0x6a/0x389
[261850.483139]  [<c10356bf>] ? copy_process+0x57a/0xf28
[261850.483139]  [<c1035a60>] ? copy_process+0x91b/0xf28
[261850.483139]  [<c10361a7>] ? do_fork+0x13a/0x2bc
[261850.483139]  [<c114cae3>] ? copy_to_user+0x29/0xf8
[261850.483139]  [<c1006e6e>] ? sys_clone+0x21/0x27
[261850.483139]  [<c100829c>] ? syscall_call+0x7/0xb
[261852.541339] nfs: RPC call returned error 88
