Avi Kivity wrote:
> Richard Davies wrote:
> > Below are two 'perf top' snapshots during a slow boot, which appear to
> > me to support your idea of a spin-lock problem. ...
> > PerfTop: 62249 irqs/sec kernel:96.9% exact: 0.0% [4000Hz cycles], (all, 16 CPUs)
> > --------------------------------------------------------------------------------
> >
> > 35.80% [kernel] [k] _raw_spin_lock_irqsave
> > 21.64% [kernel] [k] isolate_freepages_block
>
> Please disable ksm, and if this function persists in the profile, reduce
> some memory from the guests.
>
> > 5.91% [kernel] [k] yield_to
> > 4.95% [kernel] [k] _raw_spin_lock
> > 3.37% [kernel] [k] kvm_vcpu_on_spin
>
> Except for isolate_freepages_block, all functions up to here have to do
> with dealing with cpu overcommit. But let's deal with them after we see
> a profile with isolate_freepages_block removed.

I can trigger the slow boots without KSM, and they have the same profile,
with _raw_spin_lock_irqsave and isolate_freepages_block at the top.

I reduced the guests to 3x 20GB 8-core VMs on a 128GB host (rather than
3x 40GB 8-core VMs), and haven't managed to get a really slow boot (>5
minutes) yet. I'll post again when I get one.
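For reference, a rough sketch of the commit arithmetic for the two guest configurations. It assumes the 16 CPUs reported in the perf top header are all available to the guests and ignores host and qemu overhead; the function name is only illustrative.

```python
def commit_ratios(n_guests, guest_gb, guest_vcpus, host_gb, host_cpus):
    """Return (memory commit ratio, vCPU commit ratio) for identical guests.

    Assumption: every guest has the same size and all host memory and CPUs
    are available to guests (host/qemu overhead is ignored).
    """
    mem_ratio = n_guests * guest_gb / host_gb
    cpu_ratio = n_guests * guest_vcpus / host_cpus
    return mem_ratio, cpu_ratio


# Original setup: 3x 40GB 8-core guests on a 128GB, 16-CPU host.
print(commit_ratios(3, 40, 8, 128, 16))

# Reduced setup: 3x 20GB 8-core guests on the same host.
print(commit_ratios(3, 20, 8, 128, 16))
```

Under these assumptions, guest memory drops from roughly 94% to roughly 47% of host RAM, while the vCPU overcommit stays at 1.5x (24 vCPUs on 16 CPUs) in both setups.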
In the slowest boot that I have so far (1-2 minutes), this is the perf top
output:

PerfTop: 26741 irqs/sec kernel:97.5% exact: 0.0% [4000Hz cycles], (all, 16 CPUs)
--------------------------------------------------------------------------------

 53.94%  [kernel]        [k] clear_page_c
  2.77%  [kernel]        [k] svm_vcpu_put
  2.60%  [kernel]        [k] svm_vcpu_run
  1.79%  [kernel]        [k] sub_preempt_count
  1.56%  [kernel]        [k] svm_vcpu_load
  1.44%  [kernel]        [k] __schedule
  1.36%  [kernel]        [k] kvm_arch_vcpu_ioctl_run
  1.34%  [kernel]        [k] resched_task
  1.32%  [kernel]        [k] _raw_spin_lock
  0.98%  [kernel]        [k] trace_preempt_on
  0.95%  [kernel]        [k] get_parent_ip
  0.94%  [kernel]        [k] yield_to
  0.88%  [kernel]        [k] __switch_to
  0.87%  [kernel]        [k] get_page_from_freelist
  0.81%  [kernel]        [k] in_lock_functions
  0.76%  [kernel]        [k] add_preempt_count
  0.72%  [kernel]        [k] kvm_vcpu_on_spin
  0.69%  [kernel]        [k] free_pages_prepare
  0.59%  [kernel]        [k] find_highest_vector
  0.57%  [kernel]        [k] rcu_note_context_switch
  0.55%  [kernel]        [k] paging64_walk_addr_generic
  0.54%  [kernel]        [k] __srcu_read_lock
  0.49%  [kernel]        [k] trace_preempt_off
  0.47%  [kernel]        [k] reschedule_interrupt
  0.45%  [kernel]        [k] sched_clock_cpu
  0.40%  [kernel]        [k] trace_hardirqs_on
  0.38%  [kernel]        [k] clear_huge_page
  0.37%  [kernel]        [k] prep_compound_page
  0.32%  [kernel]        [k] x86_emulate_instruction
  0.32%  [kernel]        [k] _raw_spin_lock_irq
  0.31%  [kernel]        [k] __srcu_read_unlock
  0.31%  [kernel]        [k] trace_hardirqs_off
  0.30%  [kernel]        [k] pick_next_task_fair
  0.29%  [kernel]        [k] kvm_find_cpuid_entry
  0.28%  [kernel]        [k] x86_decode_insn
  0.26%  [kernel]        [k] kvm_cpu_has_pending_timer
  0.26%  [kernel]        [k] init_emulate_ctxt
  0.25%  [kernel]        [k] kvm_vcpu_yield_to
  0.24%  [kernel]        [k] clear_buddies
  0.24%  [kernel]        [k] gs_change
  0.23%  [kernel]        [k] handle_exit
  0.22%  qemu-kvm        [.] vnc_refresh_server_surface
  0.22%  [kernel]        [k] update_min_vruntime
  0.22%  [kernel]        [k] gfn_to_memslot
  0.22%  [kernel]        [k] x86_emulate_insn
  0.19%  [kernel]        [k] kvm_sched_out
  0.19%  [kernel]        [k] pid_task
  0.18%  [kernel]        [k] _raw_spin_unlock
  0.18%  libc-2.10.1.so  [.] strcmp
  0.17%  [kernel]        [k] get_pid_task
  0.17%  [kernel]        [k] yield_task_fair
  0.17%  [kernel]        [k] default_send_IPI_mask_sequence_phys
  0.16%  [kernel]        [k] __rcu_read_unlock
  0.16%  [kernel]        [k] kvm_get_cr8
  0.16%  [kernel]        [k] native_sched_clock
  0.16%  [kernel]        [k] do_insn_fetch
  0.15%  [kernel]        [k] set_next_entity
  0.14%  [kernel]        [k] update_rq_clock
  0.14%  [kernel]        [k] __enqueue_entity
  0.14%  [kernel]        [k] kvm_read_guest
  0.13%  qemu-kvm        [.] g_hash_table_lookup
  0.13%  [kernel]        [k] rb_erase
  0.12%  [kernel]        [k] decode_operand
  0.12%  libz.so.1.2.3   [.] 0x0000000000006451
  0.12%  [kernel]        [k] update_curr
  0.12%  [kernel]        [k] apic_update_ppr
  0.12%  [kernel]        [k] ktime_get

5207 unprocessable samples recorded.

Thanks,

Richard.