Hi,

I have a VM with poor performance. For example, top needs seconds to refresh its output, and so does netstat. The guest hosts a MySQL DB with a web frontend, and its response is poor too. I'm looking for the culprit.

From top in the guest I get these hints:

- Memory is sufficient and the system is not swapping. The guest has 8 GB RAM and two CPUs.
- Cpu0 is struggling with a lot of software interrupts (si), between 50% and 80%.
- Cpu1 is often waiting for I/O (wa), between 0% and 20%.
- No application is consuming much CPU time.

Here is an example:

top - 11:19:18 up 18:19, 11 users,  load average: 1.44, 0.94, 0.66
Tasks:  95 total,   1 running,  94 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni, 20.0%id,  0.0%wa,  0.0%hi, 80.0%si,  0.0%st
Cpu1  :  1.9%us, 13.8%sy,  0.0%ni, 73.8%id, 10.5%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   7995216k total,  6385176k used,  1610040k free,   177772k buffers
Swap:  2104472k total,        0k used,  2104472k free,  5940884k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 6470 root      16   0 12844 1464  804 S   12  0.0   2:17.13 screen
 6022 root      15   0 41032 3052 2340 S    3  0.0   1:10.99 sshd
 8322 root       0 -20 10460 4976 2268 S    3  0.1  19:20.38 atop
10806 root      16   0  5540 1216  880 R    0  0.0   0:00.51 top
  126 root      15   0     0    0    0 S    0  0.0   0:23.33 pdflush
 3531 postgres  15   0 68616 1600  792 S    0  0.0   0:41.24 postmaster

The host in which the guest runs has 96GB RAM and 8 cores.
It seems not to do much:

top - 11:21:19 up 15 days, 15:53, 14 users,  load average: 1.40, 1.39, 1.40
Tasks: 221 total,   2 running, 219 sleeping,   0 stopped,   0 zombie
Cpu0  : 15.9%us,  2.7%sy,  0.0%ni, 81.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  5.0%us,  3.0%sy,  0.0%ni, 92.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  2.0%us,  0.3%sy,  0.0%ni, 97.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.3%us,  1.0%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  1.3%us,  0.3%sy,  0.0%ni, 98.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.3%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   96738M total,  13466M used,  83272M free,      3M buffers
Swap:   2046M total,      0M used,   2046M free,   3887M cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
21765 root      20   0  105m  15m 4244 S    5  0.0   0:00.15 crm
 3180 root      20   0 8572m 8.0g 8392 S    3  8.4  62:25.73 qemu-kvm
 8529 hacluste  10 -10 90820  14m 9400 S    0  0.0  29:52.48 cib
21329 root      20   0  9040 1364  940 R    0  0.0   0:00.16 top
28439 root      20   0     0    0    0 S    0  0.0   0:04.51 kworker/4:2
    1 root      20   0 10560  828  692 S    0  0.0   0:07.67 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.28 kthreadd
    3 root      20   0     0    0    0 S    0  0.0   3:03.23 ksoftirqd/0
    6 root      RT   0     0    0    0 S    0  0.0   0:05.02 migration/0
    7 root      RT   0     0    0    0 S    0  0.0   0:02.82 watchdog/0
    8 root      RT   0     0    0    0 S    0  0.0   0:05.18 migration/1

I think the host is not the problem.

The VM resides on a SAN which is attached via FC. The whole system is a two-node cluster. The VM lives in a raw partition without a filesystem, which I read should be good for performance. It is slow when running on the other node too.

Inside the VM I have logical volumes (it was a physical system that I migrated to a VM). The partitions are formatted with reiserfs (the system is already some years old, and at that time reiserfs was popular ...).
I use iostat on the guest. This is a typical snapshot:

Device:  rrqm/s  wrqm/s   r/s   w/s  rkB/s  wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
vda        0.00    3.05  0.00  2.05   0.00  20.40     19.90      0.09  44.59  31.22   6.40
dm-0       0.00    0.00  0.00  4.55   0.00  18.20      8.00      0.24  52.31   7.74   3.52
dm-1       0.00    0.00  0.00  0.00   0.00   0.00      0.00      0.00   0.00   0.00   0.00
dm-2       0.00    0.00  0.00  0.10   0.00   0.40      8.00      0.01  92.00  56.00   0.56
dm-3       0.00    0.00  0.00  0.00   0.00   0.00      0.00      0.00   0.00   0.00   0.00
dm-4       0.00    0.00  0.00  0.00   0.00   0.00      0.00      0.00   0.00   0.00   0.00
dm-5       0.00    0.00  0.00  0.35   0.00   1.40      8.00      0.03  90.29  65.71   2.30
dm-6       0.00    0.00  0.00  0.00   0.00   0.00      0.00      0.00   0.00   0.00   0.00

vda has several partitions: one for /, one for swap, and two physical volumes for LVM.

According to "man iostat", the columns await and svctm seem to be the important ones. The man page says:

  await
      The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.

  svctm
      The average service time (in milliseconds) for I/O requests that were issued to the device.

It seems the system is waiting a long time for I/O, although the amount of transferred data is small.

I have some suspicions:

- the LVM setup in the guest
- some hardware
- the cache mode for the disk, which is "none". Otherwise I can't do a live migration.

What do you think? How can I find out where the high si comes from?
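To make the iostat reading above a bit more concrete, here is a minimal sketch of the arithmetic. It assumes the man page definition quoted above holds, i.e. await is queue time plus service time, so the difference await - svctm is a rough estimate of the time a request spends queued. The names SNAPSHOT, queue_wait_ms and queue_waits are made up for this illustration, the values are copied by hand from the snapshot, and svctm is flagged as untrustworthy in newer sysstat man pages, so the result is only an approximation.

```python
# Values copied by hand from the iostat snapshot in this mail:
# device -> (await in ms, svctm in ms). Idle devices are left out.
SNAPSHOT = {
    "vda": (44.59, 31.22),
    "dm-0": (52.31, 7.74),
    "dm-2": (92.00, 56.00),
    "dm-5": (90.29, 65.71),
}


def queue_wait_ms(await_ms, svctm_ms):
    """Estimate the time a request spends queued: await minus service time.

    Assumption: await = queue time + service time, as the man page states.
    """
    return round(await_ms - svctm_ms, 2)


def queue_waits(snapshot):
    """Return the estimated queue wait in ms for every device in the snapshot."""
    return {dev: queue_wait_ms(a, s) for dev, (a, s) in snapshot.items()}


if __name__ == "__main__":
    for dev, wait in queue_waits(SNAPSHOT).items():
        print(f"{dev}: roughly {wait:.2f} ms queued per request")
```

If this estimate is anywhere near right, dm-0 spends most of its await in the queue, while for vda, dm-2 and dm-5 the service time itself makes up the larger share.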
Network and disk are virtio devices (which should be fast):

vm58820-4:~ # lsmod|grep -i virt
virtio_balloon         22788  0
virtio_net             30464  0
virtio_pci             27264  0
virtio_ring            21376  1 virtio_pci
virtio_blk             25224  5
virtio                 22916  4 virtio_balloon,virtio_net,virtio_pci,virtio_blk

This is the config of the guest:

<domain type='kvm'>
  <name>mausdb_vm</name>
  <uuid>f08c2f32-fe35-137a-0e9d-fa7485d57974</uuid>
  <memory unit='KiB'>8198144</memory>
  <currentMemory unit='KiB'>8197376</currentMemory>
  <vcpu placement='static'>2</vcpu>
  <os>
    <type arch='x86_64' machine='pc-i440fx-1.4'>hvm</type>
    <boot dev='cdrom'/>
    <bootmenu enable='yes'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/vg_cluster_01/lv_cluster_01'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:37:92:01'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
  <seclabel type='none'/>
</domain>

Host OS is SLES 11 SP4, guest OS is SLES 10 SP4. Both are 64-bit.

Thanks for any hint.

Bernd

--
Bernd Lentes
System administration
Institute of Developmental Genetics
Building 35.34 - Room 208
HelmholtzZentrum München
bernd.lentes@xxxxxxxxxxxxxxxxxxxxx
phone: +49 (0)89 3187 1241
fax: +49 (0)89 3187 2294

Only when you commit to something can you be wrong
Scott Adams