[BUG] Boot hang since using guest kernel 3.8.y

Hi,

I'm seeing a frequent KVM boot hang with qemu-kvm 1.0+noroms-0ubuntu14.7
since moving the guest to kernel 3.8.y.  This report is for guest kernel
3.8.6, but the issue also occurred with kernel 3.8.2.

I know this qemu-kvm version is a bit dated, but contacting Ubuntu
people on IRC wasn't successful.

The qemu-kvm command line is quite a whopper (as used by virt-manager):

  /usr/bin/kvm -S -M pc-1.0 -cpu core2duo,+lahf_lm,+rdtscp,+pdpe1gb,+aes,+popcnt,+sse4.2,+sse4.1,+dca,+xtpr,+cx16,+tm2,+est,+vmx,+ds_cpl,+pbe,+tm,+ht,+ss,+acpi,+ds -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -name hcl-kvm-01 -uuid 09a71c4f-ea29-145e-d6ac-e22b8550aaca -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/hcl-kvm-01.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot order=dcn,menu=on -drive file=/var/lib/libvirt/isolib/asg-9.080-10.1-hei12.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive file=/var/lib/libvirt/images/hcl-kvm-01.img,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-pci,bus=pci.0,addr=0x8,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,fd=17,id=hostnet0 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:d0:56:01,bus=pci.0,addr=0x3 -netdev tap,fd=19,id=hostnet1 -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:d0:56:02,bus=pci.0,addr=0x5 -netdev tap,fd=20,id=hostnet2 -device virtio-net-pci,netdev=hostnet2,id=net2,mac=52:54:00:d0:56:03,bus=pci.0,addr=0x6 -netdev tap,fd=21,id=hostnet3 -device virtio-net-pci,netdev=hostnet3,id=net3,mac=52:54:00:d0:56:04,bus=pci.0,addr=0x7 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev file,id=charserial1,path=/tmp/v9.1-ttyS0.log -device isa-serial,chardev=charserial1,id=serial1 -chardev file,id=charserial2,path=/tmp/hcl-kvm-01-console.log -device isa-serial,chardev=charserial2,id=serial2 -usb -vnc 127.0.0.1:0 -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4

Most of the time the guest is stuck at:

  smpboot: Booting Node   0, Processors  #1 OK

After enabling DEBUG in arch/x86/kernel/smpboot.c and adding a few extra
debug lines, I see:

  ...
  [    0.027205] smpboot: CPU0: Intel(R) Core(TM)2 Duo CPU     T7700  @ 2.40GHz
  (fam: 06, model: 0f, stepping: 0b)
  [    0.032000] Performance Events: unsupported p6 CPU model 15 no PMU driver,
  software events only.
  [    0.032322] NMI watchdog: disabled (cpu0): hardware events not enabled
  [    0.034997] smpboot: ++++++++++++++++++++=_---CPU UP  1
  [    0.036024] SMP alternatives: lockdep: fixing up alternatives
  [    0.038078] CPU 1 irqstacks, hard=f5c74000 soft=f5c76000
  [    0.040006] smpboot: Booting Node   0, Processors  #1 OK
  [    0.041897] smpboot: Setting warm reset code and vector.
  [    0.044030] smpboot: 1.
  [    0.044898] smpboot: 2.
  [    0.045777] smpboot: 3.
  [    0.046665] smpboot: Asserting INIT
  [    0.048019] smpboot: Waiting for send to finish...
  [    0.060113] smpboot: Deasserting INIT
  [    0.061488] smpboot: Waiting for send to finish...
  [    0.063228] smpboot: #startup loops: 2
  [    0.064005] smpboot: Sending STARTUP #1
  [    0.065378] smpboot: After apic_write

At that point the 2-vCPU guest is running at 200% CPU time.

The full output from 'dmesg' is available here [1].  It is quite noisy,
because lockdep is enabled, as well as several other locking-related
checks.  The partial output from running 'trace-cmd' is here [2].

When the guest does not hang in smpboot, it mostly hangs in the ata_piix
initialization instead:

  ...
  [    0.272000] ata_eh_release: ENTER
  [    0.272000] ata_eh_release: EXIT
  [    0.272000] ata_msleep: about to msleep(150)

Full output from 'dmesg' is here [3].  It is again quite noisy, as I have
enabled libata debugging and added some debug lines of my own.

I can see that the first case mentioned is stuck in

 smpboot.c:wakeup_secondary_cpu_via_init():  udelay(300);

and the 2nd case is stuck in ata_msleep(150).
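
To see at a glance where a captured boot log stops, one option is to pull
the last timestamped line out of it.  The following is only a minimal
Python sketch, assuming the usual "[ seconds.micros] message" printk
prefix seen in the excerpts above.  The function names are hypothetical
and not part of any existing tool.

```python
import re
import sys

# Matches lines such as "  [    0.065378] smpboot: After apic_write".
LINE_RE = re.compile(r"^\s*\[\s*(\d+\.\d+)\]\s?(.*)$")


def parse_entries(lines):
    """Return (timestamp, message) pairs for every timestamped line.

    Lines without a "[ ... ]" prefix (for example wrapped continuation
    lines) are skipped.
    """
    entries = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    return entries


def last_message(lines):
    """Return the last (timestamp, message) pair, or None if there is none."""
    entries = parse_entries(lines)
    return entries[-1] if entries else None


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage (hypothetical script name): python last_line.py <console-log>
    with open(sys.argv[1], errors="replace") as handle:
        print(last_message(handle))
```

Pointed at a serial console log of a hung boot, it prints the timestamp
and text of the last message the guest managed to emit.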

If I specify 'no-kvmclock' at boot the problem is gone.
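
For anyone trying to reproduce this, it may help to confirm inside a
booted guest whether 'no-kvmclock' really is on the kernel command line
and which clocksource ended up active.  A minimal Python sketch, assuming
the standard /proc/cmdline and sysfs clocksource files are present; the
helper names are made up for illustration:

```python
from pathlib import Path

# Standard sysfs location of the clocksource controls on Linux.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def has_no_kvmclock(cmdline: str) -> bool:
    """Return True if 'no-kvmclock' is present as its own boot parameter."""
    return "no-kvmclock" in cmdline.split()


def read_text_or_none(path: Path):
    """Read a small text file, returning None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def report() -> str:
    """Summarise the boot parameter and the clocksource state."""
    cmdline = read_text_or_none(Path("/proc/cmdline")) or ""
    current = read_text_or_none(CLOCKSOURCE_DIR / "current_clocksource")
    available = read_text_or_none(CLOCKSOURCE_DIR / "available_clocksource")
    return (
        f"no-kvmclock on cmdline: {has_no_kvmclock(cmdline)}\n"
        f"current clocksource:    {current}\n"
        f"available clocksources: {available}"
    )


if __name__ == "__main__":
    print(report())
```

With kvmclock in use, the current clocksource is normally reported as
"kvm-clock"; with 'no-kvmclock' the guest falls back to one of the other
available sources.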

Also I found this link here [4], which describes a problem similar
to the one I see.  But this case is from 2010.

I need advice on how to debug this further.

Any help appreciated.

 /Holger


[1] http://pastebin.com/yNSuSaWX
[2] http://pastebin.com/qEdC217c
[3] http://pastebin.com/hxrjEY7M
[4] http://www.spinics.net/lists/kvm/msg42654.html