Hi, Sorry it took so slow to reply you - there are only few moments when I can poke a production server and I need to notify people in advance about that :( > Can you post kvm_stat output while slowness is happening? 'perf top' on the host? and on the guest? I took 'perf top' and first thing I saw is that while guest is on acpi_pm, it shows more or less normal amount of IRQs (under 1000/s), however when I switched back to the default (which is nohz with kvm_clock), there are 40 times (!!!) more IRQs under normal operation (about 40 000/s). When the slowdown is happening, there are a lot of _spin_lock events and a lot of messages like: "WARNING: failed to keep up with mmap data. Last read 810 msecs ago." As I told before, switching to acpi_pm does not save the day, but makes situation a lot more workable (i.e., servers recover faster from the period of slowness). During slowdowns on acpi_pm I also see "_spin_lock" Raw data follows: vmstat -5 on the host: procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 131904 13952 205872 0 0 0 24 2495 9813 6 3 91 0 0 0 0 132984 13952 205872 0 0 0 47 2596 9851 5 3 91 1 1 0 0 132148 13952 205872 0 0 0 54 2644 10559 3 3 93 1 0 1 0 129084 13952 205872 0 0 0 38 3039 9752 7 3 87 2 6 0 0 126388 13952 205872 0 0 0 311 15619 9009 42 17 39 2 9 0 0 125868 13960 205872 0 0 6 86 4659 6504 98 2 0 0 8 0 0 123320 13960 205872 0 0 0 26 4682 6649 98 2 0 0 8 0 0 126252 13960 205872 0 0 0 124 4923 6776 98 2 0 0 8 0 0 125376 13960 205872 0 0 136 11 4287 5865 98 2 0 0 9 0 0 123812 13960 205872 0 0 205 51 4497 6134 98 2 0 0 8 0 0 126020 13960 205872 0 0 904 26 4483 5999 98 2 0 0 8 0 0 124052 13960 205872 0 0 15 10 4397 6200 98 2 0 0 8 0 0 125928 13960 205872 0 0 14 41 4335 5823 98 2 0 0 8 0 0 126184 13960 205872 0 0 6 14 4966 6588 98 2 0 0 8 0 0 123588 13960 205872 0 0 143 18 5234 6891 98 2 0 0 8 0 0 126640 13960 205872 0 0 6 91 5554 7334 98 2 0 0 8 0 0 123144 13960 205872 0 0 146 11 5235 7145 98 2 0 0 8 0 0 125856 13968 205872 0 0 1282 98 5481 7159 98 2 0 0 9 19 0 124124 13968 205872 0 0 782 2433 8587 8987 97 3 0 0 8 0 0 122584 13968 205872 0 0 432 90 5359 6960 98 2 0 0 8 0 0 125320 13968 205872 0 0 3074 52 5448 7095 97 3 0 0 8 0 0 121436 13968 205872 0 0 2519 81 5714 7279 98 2 0 0 8 0 0 124436 13968 205872 0 0 1 56 5242 6864 98 2 0 0 8 0 0 111324 13968 205872 0 0 2 22 10660 6686 97 3 0 0 8 0 0 107824 13968 205872 0 0 0 24 14329 8147 97 3 0 0 8 0 0 110420 13968 205872 0 0 0 68 13486 6985 98 2 0 0 8 0 0 110024 13968 205872 0 0 0 19 13085 6659 98 2 0 0 8 0 0 109932 13968 205872 0 0 0 3 12952 6415 98 2 0 0 8 0 0 108552 13968 205880 0 0 2 41 13400 7349 98 2 0 0 Few shots with kvm_stat on the host: Every 2.0s: kvm_stat -1 Wed Dec 1 04:45:47 2010 efer_reload 0 0 exits 56264102 14074 fpu_reload 311506 50 halt_exits 4733166 935 halt_wakeup 3845079 840 host_state_reload 8795964 4085 hypercalls 0 0 insn_emulation 13573212 7249 insn_emulation_fail 0 0 invlpg 1846050 20 io_exits 3579406 843 irq_exits 3038887 4879 irq_injections 5242157 3681 irq_window 124361 540 largepages 2253 0 mmio_exits 64274 20 mmu_cache_miss 664011 16 mmu_flooded 164506 1 mmu_pde_zapped 212686 8 mmu_pte_updated 729268 0 mmu_pte_write 81323616 551 mmu_recycled 277 0 mmu_shadow_zapped 652691 23 mmu_unsync 5630 8 nmi_injections 0 0 nmi_window 0 0 pf_fixed 17470658 218 pf_guest 13352205 81 remote_tlb_flush 1898930 96 request_irq 0 0 signal_exits 0 0 tlb_flush 5827433 108 Every 2.0s: kvm_stat -1 Wed Dec 1 04:47:33 2010 efer_reload 0 0 exits 58155746 18954 fpu_reload 318003 61 halt_exits 4839340 1082 halt_wakeup 3940964 984 host_state_reload 9267420 4803 hypercalls 0 0 insn_emulation 14376685 7721 insn_emulation_fail 0 0 invlpg 1855758 13 io_exits 3676471 993 irq_exits 3609310 5363 irq_injections 5648007 3922 irq_window 181397 517 largepages 2253 0 mmio_exits 65862 14 mmu_cache_miss 666017 2 mmu_flooded 164784 0 mmu_pde_zapped 213208 1 mmu_pte_updated 731301 1 mmu_pte_write 81455666 14 mmu_recycled 277 0 mmu_shadow_zapped 653910 1 mmu_unsync 5461 -2 nmi_injections 0 0 nmi_window 0 0 pf_fixed 17530360 154 pf_guest 13388143 89 remote_tlb_flush 1915787 39 request_irq 0 0 signal_exits 0 0 tlb_flush 5857307 68 Every 2.0s: kvm_stat -1 Wed Dec 1 04:47:46 2010 efer_reload 0 0 exits 58382002 14542 fpu_reload 318544 32 halt_exits 4850647 541 halt_wakeup 3951056 441 host_state_reload 9317479 2669 hypercalls 0 0 insn_emulation 14464287 5075 insn_emulation_fail 0 0 invlpg 1856481 34 io_exits 3686632 456 irq_exits 3670192 3742 irq_injections 5692201 2471 irq_window 186987 126 largepages 2253 0 mmio_exits 65981 6 mmu_cache_miss 666184 14 mmu_flooded 164819 1 mmu_pde_zapped 213264 5 mmu_pte_updated 731432 2 mmu_pte_write 81473978 563 mmu_recycled 277 0 mmu_shadow_zapped 654130 10 mmu_unsync 5410 4 nmi_injections 0 0 nmi_window 0 0 pf_fixed 17536667 653 pf_guest 13391345 300 remote_tlb_flush 1917634 120 request_irq 0 0 signal_exits 0 0 tlb_flush 5860221 205 'perf top' on the host: ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- PerfTop: 3894 irqs/sec kernel: 2.4% exact: 0.0% [1000Hz cycles], (all, 4 CPUs) ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- samples pcnt function DSO _______ _____ ________________________ ______________________________________________________________ 109.00 15.1% vmx_vcpu_run /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko 64.00 8.9% copy_user_generic_string [kernel.kallsyms] 61.00 8.4% __ticket_spin_lock [kernel.kallsyms] 47.00 6.5% vcpu_enter_guest /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko 23.00 3.2% schedule [kernel.kallsyms] 18.00 2.5% kvm_read_guest /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko 18.00 2.5% gfn_to_hva /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko 17.00 2.4% native_write_msr_safe [kernel.kallsyms] 17.00 2.4% x86_decode_insn /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko 13.00 1.8% rtl8169_interrupt /lib/modules/2.6.35-23-server/kernel/drivers/net/r8169.ko 13.00 1.8% native_read_msr_safe [kernel.kallsyms] 11.00 1.5% paging64_walk_addr /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko 10.00 1.4% vmcs_writel /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko 10.00 1.4% PyEval_EvalFrameEx /usr/bin/python2.6 10.00 1.4% update_curr [kernel.kallsyms] 10.00 1.4% x86_emulate_insn /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko 9.00 1.2% vmx_get_cpl /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko 9.00 1.2% __srcu_read_lock [kernel.kallsyms] 9.00 1.2% emulate_instruction /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko 9.00 1.2% __memcpy [kernel.kallsyms] 9.00 1.2% handle_exception /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko 8.00 1.1% vmx_complete_interrupts /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko 8.00 1.1% native_read_tsc [kernel.kallsyms] 8.00 1.1% __vcpu_run /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm.ko 7.00 1.0% fput [kernel.kallsyms] 7.00 1.0% __switch_to [kernel.kallsyms] 6.00 0.8% vmx_cache_reg /lib/modules/2.6.35-23-server/kernel/arch/x86/kvm/kvm-intel.ko 6.00 0.8% _raw_spin_lock_irqsave [kernel.kallsyms] 'perf top' on guest: ------------------------------------------------------------------------------ PerfTop: 13547 irqs/sec kernel:99.4% [100000 cpu-clock-msecs], (all, 4 CPUs) ------------------------------------------------------------------------------ samples pcnt kernel function _______ _____ _______________ 119287.00 - 94.8% : _spin_lock 1108.00 - 0.9% : do_page_fault 599.00 - 0.5% : _spin_unlock_irqrestore 512.00 - 0.4% : finish_task_switch 432.00 - 0.3% : clear_page_c 361.00 - 0.3% : __do_softirq 214.00 - 0.2% : native_flush_tlb 158.00 - 0.1% : native_set_pte_at 147.00 - 0.1% : flush_tlb_page 141.00 - 0.1% : retint_careful 101.00 - 0.1% : kmem_cache_alloc 95.00 - 0.1% : get_page_from_freelist 94.00 - 0.1% : unmap_vmas 82.00 - 0.1% : native_set_pmd 74.00 - 0.1% : virtnet_poll [virtio_net] WARNING: failed to keep up with mmap data. Last read 810 msecs ago. WARNING: failed to keep up with mmap data. Last read 0 msecs ago. WARNING: failed to keep up with mmap data. Last read 0 msecs ago. WARNING: failed to keep up with mmap data. Last read 0 msecs ago. 'perf top' on the guest under normal operation (nohz, clocksource=kvm_clock): ------------------------------------------------------------------------------ PerfTop: 39501 irqs/sec kernel:99.8% [100000 cpu-clock-msecs], (all, 4 CPUs) ------------------------------------------------------------------------------ samples pcnt kernel function _______ _____ _______________ 256987.00 - 98.7% : native_safe_halt 377.00 - 0.1% : __do_softirq 332.00 - 0.1% : finish_task_switch 257.00 - 0.1% : do_page_fault 201.00 - 0.1% : _spin_unlock_irqrestore 121.00 - 0.0% : tick_nohz_stop_sched_tick 113.00 - 0.0% : pvclock_clocksource_read 106.00 - 0.0% : flush_tlb_page 99.00 - 0.0% : tick_nohz_restart_sched_tick 59.00 - 0.0% : system_call_after_swapgs 55.00 - 0.0% : fget_light 52.00 - 0.0% : kmem_cache_alloc 52.00 - 0.0% : do_sys_poll 50.00 - 0.0% : perf_poll 47.00 - 0.0% : native_flush_tlb 'perf top' on the guest under normal operation (with clocksource=acpi_pm nohz=off highres=off): ------------------------------------------------------------------------------ PerfTop: 949 irqs/sec kernel:89.3% [100000 cpu-clock-msecs], (all, 4 CPUs) ------------------------------------------------------------------------------ samples pcnt kernel function _______ _____ _______________ 10196.00 - 92.3% : native_safe_halt 145.00 - 1.3% : clear_page_c 139.00 - 1.3% : do_page_fault 100.00 - 0.9% : acpi_pm_read 52.00 - 0.5% : flush_tlb_page 49.00 - 0.4% : finish_task_switch 32.00 - 0.3% : native_flush_tlb 27.00 - 0.2% : __do_softirq 24.00 - 0.2% : _spin_unlock_irqrestore 22.00 - 0.2% : native_set_pmd 20.00 - 0.2% : native_set_pte_at 19.00 - 0.2% : _spin_lock 18.00 - 0.2% : generic_unplug_device 17.00 - 0.2% : tick_nohz_stop_sched_tick 13.00 - 0.1% : unmap_vmas 'perf top' on the guest when slowdown is happening (with clocksource=acpi_pm nohz=off highres=off): ------------------------------------------------------------------------------ PerfTop: 966 irqs/sec kernel:94.0% [100000 cpu-clock-msecs], (all, 4 CPUs) ------------------------------------------------------------------------------ samples pcnt kernel function _______ _____ _______________ 10799.00 - 91.2% : native_safe_halt 514.00 - 4.3% : _spin_lock 115.00 - 1.0% : clear_page_c 87.00 - 0.7% : do_page_fault 68.00 - 0.6% : __do_softirq 49.00 - 0.4% : acpi_pm_read 44.00 - 0.4% : _spin_unlock_irqrestore 39.00 - 0.3% : finish_task_switch 23.00 - 0.2% : flush_tlb_page 21.00 - 0.2% : native_flush_tlb 8.00 - 0.1% : native_set_pmd 8.00 - 0.1% : native_set_pte_at 7.00 - 0.1% : tick_nohz_restart_sched_tick 6.00 - 0.1% : tick_nohz_stop_sched_tick Thanks, Dmitry -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html