Virtual SCSI disks hang on heavy IO

Hi,

I am seeing disk I/O hangs in guest systems running inside a KVM virtual
machine. When the guest's system disk is a SCSI disk and I put a lot of I/O
on it, error messages like these eventually appear on the console and in
dmesg:

sd 0:0:0:0: [sda] ABORT operation started
sd 0:0:0:0: ABORT operation failed.
sd 0:0:0:0: [sda] DEVICE RESET operation started
sd 0:0:0:0: DEVICE RESET operation complete.
scsi target0:0:0: control msgout: c.
scsi target0:0:0: has been reset
sd 0:0:0:0: [sda] BUS RESET operation started
sym0: SCSI BUS reset detected.
sd 0:0:0:0: BUS RESET operation complete.
sym0: SCSI BUS has been reset.
sym0: unknown interrupt(s) ignored, ISTAT=0x1 DSTAT=0x80 SIST=0x0
sd 0:0:0:0: [sda] ABORT operation started
sd 0:0:0:0: ABORT operation failed.
sd 0:0:0:0: [sda] DEVICE RESET operation started
sd 0:0:0:0: DEVICE RESET operation complete.
scsi target0:0:0: control msgout: c.
scsi target0:0:0: has been reset
sd 0:0:0:0: [sda] BUS RESET operation started
sd 0:0:0:0: BUS RESET operation complete.
(this goes on for a while)
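
In the excerpt above, every ABORT fails and the recovery then moves on to a DEVICE RESET and a BUS RESET, both of which complete. To check whether that pattern holds over a longer dmesg capture, a minimal sketch along these lines could tally the outcome of each recovery step. The function name and the regular expression are my own assumptions, modelled only on the message wording shown here:

```python
import re
from collections import Counter

# Matches outcome lines such as "sd 0:0:0:0: ABORT operation failed."
# (pattern is an assumption based on the messages quoted in this mail).
OUTCOME = re.compile(r"(ABORT|DEVICE RESET|BUS RESET) operation (failed|complete)")


def recovery_outcomes(dmesg_text):
    """Count (operation, result) pairs found in dmesg-style text."""
    outcomes = Counter()
    for line in dmesg_text.splitlines():
        match = OUTCOME.search(line)
        if match:
            outcomes[(match.group(1), match.group(2))] += 1
    return outcomes


# One recovery cycle copied from the excerpt above.
SAMPLE = """\
sd 0:0:0:0: [sda] ABORT operation started
sd 0:0:0:0: ABORT operation failed.
sd 0:0:0:0: [sda] DEVICE RESET operation started
sd 0:0:0:0: DEVICE RESET operation complete.
sd 0:0:0:0: [sda] BUS RESET operation started
sym0: SCSI BUS reset detected.
sd 0:0:0:0: BUS RESET operation complete.
"""

for (operation, result), count in sorted(recovery_outcomes(SAMPLE).items()):
    print(f"{operation:<12} {result:<8} {count}")
```

Feeding it the saved output of dmesg instead of SAMPLE would show how many cycles occurred and whether any ABORT ever succeeded.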

During this time, all I/O operations on this disk hang intermittently for
minutes at a time, but they do complete eventually. Once the heavy I/O is
over, things go back to normal.

Here's an excerpt of what I found in libvirt's logfile when the problem struck:
lsi_scsi: error: ORDERED queue not implemented
lsi_scsi: error: ORDERED queue not implemented
lsi_scsi: error: ORDERED queue not implemented
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: ORDERED queue not implemented
lsi_scsi: error: ORDERED queue not implemented
lsi_scsi: error: ORDERED queue not implemented
(goes on for a while)
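
To get a feel for the ratio between the two kinds of lsi_scsi messages, the log could be summarised with something like the sketch below. The helper name is hypothetical and the prefix is taken from the lines quoted above. The 0x0c in the second message presumably corresponds to the "control msgout: c." line in the guest's dmesg, but that is my reading, not something the logs state.

```python
from collections import Counter

# Prefix as it appears in the libvirt log excerpt above.
PREFIX = "lsi_scsi: error: "


def summarize_lsi_errors(log_text):
    """Count each distinct lsi_scsi error message in the given log text."""
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(PREFIX):
            counts[line[len(PREFIX):]] += 1
    return counts


# The seven lines quoted in the excerpt above.
LOG_SAMPLE = """\
lsi_scsi: error: ORDERED queue not implemented
lsi_scsi: error: ORDERED queue not implemented
lsi_scsi: error: ORDERED queue not implemented
lsi_scsi: error: Unimplemented message 0x0c
lsi_scsi: error: ORDERED queue not implemented
lsi_scsi: error: ORDERED queue not implemented
lsi_scsi: error: ORDERED queue not implemented
"""

for message, count in summarize_lsi_errors(LOG_SAMPLE).most_common():
    print(f"{count:4d}  {message}")
```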

I have witnessed this problem on two rather different host machines: a beefy
server with two Intel Xeon processors and a hardware RAID running Fedora 14,
and my home server, an AMD machine with just a single HDD running Gentoo.

On the first machine, this happened when I ran
"cat /dev/zero > /bigfile ; rm /bigfile" (to fill unused parts of the
filesystem with zeroes, so the HD image would compress better). On the
second, it happened while unpacking the portage tree in a virtualized Gentoo
guest.
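
The first workload is just a long sequential write of zeroes followed by a delete. For anyone who wants to try reproducing it without filling the whole filesystem, a size-capped variant might look like the sketch below. The function name, the chunk size and the explicit fsync are my own choices for illustration, and I have not verified that a capped run is enough to trigger the hang:

```python
import os


def zero_fill(path, total_bytes, chunk_size=1024 * 1024):
    """Write total_bytes of zeroes to path, sync to disk, then delete the file.

    Returns the number of bytes written. Unlike "cat /dev/zero > /bigfile",
    this stops at a fixed size instead of running until the disk is full.
    """
    chunk = bytes(chunk_size)  # a buffer of chunk_size zero bytes
    written = 0
    try:
        with open(path, "wb") as f:
            while written < total_bytes:
                n = min(chunk_size, total_bytes - written)
                f.write(chunk[:n])
                written += n
            f.flush()
            os.fsync(f.fileno())  # push the data out of the page cache
    finally:
        # Mirror the "rm /bigfile" step even if the write fails part-way.
        if os.path.exists(path):
            os.remove(path)
    return written
```

Inside the guest, a call such as zero_fill("/bigfile", 4 * 1024**3) would write 4 GiB to the SCSI disk and remove the file again.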

Qemu-kvm is 0.13.0 in both cases. The kernel version is 2.6.35.10-74.fc14.x86_64 
for the Fedora machine and 2.6.36-gentoo-r5 for the Gentoo machine.

In both cases, the guest was started with libvirt. The actual command line on
the Fedora machine was (from libvirt's logfile):

LC_ALL=C PATH=/sbin:/usr/sbin:/bin:/usr/bin HOME=/root USER=root LOGNAME=root \
QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.13 -enable-kvm -m 1024 \
  -smp 4,sockets=4,cores=1,threads=1 \
  -name basicgentooimage \
  -uuid 348fd057-e318-da7b-9b3f-52d9ccf10b93 \
  -nodefconfig -nodefaults \
  -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/basicgentooimage.monitor,server,nowait \
  -mon chardev=monitor,mode=readline \
  -rtc base=utc -no-acpi -boot d \
  -device lsi,id=scsi0,bus=pci.0,addr=0x4 \
  -drive file=/data/basicgentooimage.img,if=none,id=drive-scsi0-0-0,format=qcow2 \
  -device scsi-disk,bus=scsi0.0,scsi-id=0,drive=drive-scsi0-0-0,id=scsi0-0-0 \
  -drive file=/data/install-amd64-minimal-20101021.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
  -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 \
  -device e1000,vlan=0,id=net0,mac=52:54:00:84:6d:49,bus=pci.0,addr=0x3 \
  -net tap,fd=91,vlan=0,name=hostnet0 \
  -usb -vnc 127.0.0.1:2 -k de -vga cirrus \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5

On the Gentoo machine it was:

LC_ALL=C \
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/i686-pc-linux-gnu/gcc-bin/4.4.5:/usr/x86_64-pc-linux-gnu/gcc-bin/4.4.5 \
HOME=/root USER=root QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -S -M pc-0.13 -enable-kvm -m 1024 \
  -smp 2,sockets=2,cores=1,threads=1 \
  -name testnet \
  -uuid c92c7c02-4a8b-777a-ee3e-7d98b3108c02 \
  -nodefconfig -nodefaults \
  -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/testnet.monitor,server,nowait \
  -mon chardev=monitor,mode=control \
  -rtc base=utc -boot d \
  -device lsi,id=scsi0,bus=pci.0,addr=0x4 \
  -drive file=/kvmdata/testnet-system,if=none,id=drive-scsi0-0-0,format=qcow2 \
  -device scsi-disk,bus=scsi0.0,scsi-id=0,drive=drive-scsi0-0-0,id=scsi0-0-0 \
  -drive file=/kvmdata/install-amd64-minimal-20110303.iso,if=none,media=cdrom,id=drive-ide0-0-0,readonly=on,format=raw \
  -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
  -netdev tap,fd=14,id=hostnet0 \
  -device e1000,netdev=hostnet0,id=net0,mac=24:42:53:21:52:45,bus=pci.0,addr=0x3 \
  -usb -vnc 127.0.0.1:0 -k de -vga cirrus \
  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
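
Both command lines attach the system disk the same way: a qcow2 image exposed as a scsi-disk on the bus of the emulated lsi controller. To pull that relationship out of a long libvirt-generated command line, a small sketch like the following might help. The helper names are hypothetical, and the parsing only assumes the comma-separated key=value syntax visible in the command lines above:

```python
import shlex


def parse_props(value):
    """Split 'name,key=val,...' into (name, {key: val})."""
    head, *rest = value.split(",")
    props = {}
    for item in rest:
        key, _, val = item.partition("=")
        props[key] = val
    return head, props


def scsi_disks(cmdline):
    """Return (bus, image file, image format) for every scsi-disk device."""
    args = shlex.split(cmdline)
    devices, drives = [], {}
    for flag, value in zip(args, args[1:]):
        if flag == "-device":
            devices.append(parse_props(value))
        elif flag == "-drive":
            # -drive values have no leading name, so add a placeholder one.
            _, props = parse_props("drive," + value)
            drives[props.get("id")] = props
    result = []
    for name, props in devices:
        if name == "scsi-disk":
            drive = drives.get(props.get("drive"), {})
            result.append((props.get("bus"), drive.get("file"), drive.get("format")))
    return result


# Fragment of the Fedora command line, written out on a single line.
CMD = (
    "/usr/bin/qemu-kvm -device lsi,id=scsi0,bus=pci.0,addr=0x4 "
    "-drive file=/data/basicgentooimage.img,if=none,id=drive-scsi0-0-0,format=qcow2 "
    "-device scsi-disk,bus=scsi0.0,scsi-id=0,drive=drive-scsi0-0-0,id=scsi0-0-0"
)

for bus, image, fmt in scsi_disks(CMD):
    print(f"bus={bus} image={image} format={fmt}")
```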

Does anybody have an idea what might cause this or what might be done about it?

Regards,

	Guido