2017-11-15 09:17+0800, Wanpeng Li:
> Ping, :)

Ah, sorry, I got distracted while learning about the hotplug mechanism.
Indeed we cannot move the callback earlier because the cpufreq driver
kvm uses on crappy hardware gets set in CPUHP_AP_ONLINE_DYN, which is
way too late.

> 2017-11-09 10:52 GMT+08:00 Wanpeng Li <kernellwp@xxxxxxxxx>:
> > From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> >
> > watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [qemu-system-x86:10185]
> > CPU: 6 PID: 10185 Comm: qemu-system-x86 Tainted: G OE 4.14.0-rc4+ #4
> > RIP: 0010:kvm_get_time_scale+0x4e/0xa0 [kvm]
> > Call Trace:
> >  ? get_kvmclock_ns+0xa3/0x140 [kvm]
> >  get_time_ref_counter+0x5a/0x80 [kvm]
> >  kvm_hv_process_stimers+0x120/0x5f0 [kvm]
> >  ? kvm_hv_process_stimers+0x120/0x5f0 [kvm]
> >  ? preempt_schedule+0x27/0x30
> >  ? ___preempt_schedule+0x16/0x18
> >  kvm_arch_vcpu_ioctl_run+0x4b4/0x1690 [kvm]
> >  ? kvm_arch_vcpu_load+0x47/0x230 [kvm]
> >  kvm_vcpu_ioctl+0x33a/0x620 [kvm]
> >  ? kvm_vcpu_ioctl+0x33a/0x620 [kvm]
> >  ? kvm_vm_ioctl_check_extension_generic+0x3b/0x40 [kvm]
> >  ? kvm_dev_ioctl+0x279/0x6c0 [kvm]
> >  do_vfs_ioctl+0xa1/0x5d0
> >  ? __fget+0x73/0xa0
> >  SyS_ioctl+0x79/0x90
> >  entry_SYSCALL_64_fastpath+0x1e/0xa9
> >
> > This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
> > cpu-hotplug stress simultaneously. __this_cpu_read(cpu_tsc_khz) returns 0
> > (set in kvmclock_cpu_down_prep()) when the pCPU is unhotplug which results
> > in kvm_get_time_scale() gets into an infinite loop.
> >
> > This patch fixes it by treating the unhotplug pCPU as not using master clock.
> >
> > Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> > Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> > ---
> >  arch/x86/kvm/x86.c | 11 +++++++----
> >  1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 03869eb..d61dcce3 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -1795,10 +1795,13 @@ u64 get_kvmclock_ns(struct kvm *kvm)
> >          /* both __this_cpu_read() and rdtsc() should be on the same cpu */
> >          get_cpu();
> >
> > -        kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,
> > -                           &hv_clock.tsc_shift,
> > -                           &hv_clock.tsc_to_system_mul);
> > -        ret = __pvclock_read_cycles(&hv_clock, rdtsc());
> > +        if (__this_cpu_read(cpu_tsc_khz)) {
> > +                kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,

Would be safer to read __this_cpu_read(cpu_tsc_khz) only once, but I
think it works for now as unplug thread must be scheduled and get_cpu()
prevents changes.

> > +                                   &hv_clock.tsc_shift,
> > +                                   &hv_clock.tsc_to_system_mul);
> > +                ret = __pvclock_read_cycles(&hv_clock, rdtsc());
> > +        } else
> > +                ret = ktime_get_boot_ns() + ka->kvmclock_offset;

Not pretty, but gets the job done ...

Reviewed-by: Radim Krčmář <rkrcmar@xxxxxxxxxx>
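
For illustration only, the "read only once" suggestion could look roughly
like the userspace model below. This is a sketch, not the kernel code:
every model_* name is hypothetical, the globals merely stand in for
__this_cpu_read(cpu_tsc_khz), rdtsc(), ktime_get_boot_ns() and
ka->kvmclock_offset, and the cycle-to-ns conversion is a plain division
rather than the shift/mul pair that kvm_get_time_scale() computes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel state named in the patch. */
static uint64_t model_cpu_tsc_khz;     /* models the per-CPU cpu_tsc_khz */
static uint64_t model_tsc;             /* models the value of rdtsc() */
static uint64_t model_boot_ns;         /* models ktime_get_boot_ns() */
static int64_t  model_kvmclock_offset; /* models ka->kvmclock_offset */

/*
 * Simplified conversion from TSC cycles to nanoseconds.
 * Assumption: plain integer math, no overflow handling for large
 * cycle counts. A zero frequency is rejected up front, because that
 * is the input the patch keeps away from kvm_get_time_scale().
 */
static uint64_t model_cycles_to_ns(uint64_t cycles, uint64_t tsc_khz)
{
        assert(tsc_khz != 0);
        return cycles * 1000000ULL / tsc_khz;
}

/*
 * Models the patched get_kvmclock_ns() with the frequency read once
 * into a local, so the zero check and the scaling see the same value.
 */
static uint64_t model_get_kvmclock_ns(void)
{
        uint64_t tsc_khz = model_cpu_tsc_khz; /* single read */

        if (tsc_khz)
                return model_cycles_to_ns(model_tsc, tsc_khz);

        /* Frequency was zeroed (CPU going down): fall back to boot time. */
        return (uint64_t)((int64_t)model_boot_ns + model_kvmclock_offset);
}
```

With the value cached in a local, the check and the use cannot disagree
even if something were to change the per-CPU variable in between, which
is the property the comment above asks for.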