Re: [PATCH] KVM, CPU hotplug: Avoid wraparound in pvclock_get_nsec_offset

On Mon, Dec 12, 2011 at 02:37:15PM +0100, Vasilis Liaskovitis wrote:
> Hotplugging a vCPU with kvmclock enabled can cause a guest stall/hang. When
> the stall happens, pvclock_clocksource_read() is called for the new vCPU and
> pvclock_get_nsec_offset calculates native_read_tsc() - shadow->tsc_timestamp.
> shadow->tsc_timestamp contains a value larger than native_read_tsc(), so the
> result is a very large 64-bit unsigned value. The global tsc variable 
> last_value gets updated with this, causing system stall/freeze:
> "rcu_sched_state detected stalls on CPUs/tasks ..."
> 
> The large shadow->tsc_timestamp value observed in the hanged cases is the tsc
> written into the "boot clock" on VM startup.
> Is the "boot clock" persistent in the guest? Can it get accessed by a vCPU
> other than vCPU 0, if its own hv_clock struct has not yet been registered
> or if the host has not yet updated the new hv_clock with a valid tsc_timestamp 
> in kvm_guest_time_update() ?

When a CPU is hotplugged it'll have its TSC start counting at 0. 

We should cope with that fact and fix this bug in the boot clock handling.

From the guest's perspective, shadow->tsc_timestamp should be updated to
reflect the current vcpu (which is not the case when it's reading the
value from the boot clock).
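
To make the failure mode concrete, here is a rough user-space sketch of the
arithmetic only. It is not kernel code: raw_delta() and clamped_delta() are
hypothetical names standing in for the subtraction before and after the
quoted patch, and the scaling step (pvclock_scale_delta) is left out.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the unpatched subtraction. Both operands are
 * unsigned 64-bit values, so a TSC reading that is smaller than the
 * timestamp does not go negative; it wraps around modulo 2^64.
 */
static uint64_t raw_delta(uint64_t tsc_now, uint64_t tsc_timestamp)
{
	return tsc_now - tsc_timestamp;
}

/*
 * Hypothetical stand-in for the clamped behaviour of the workaround:
 * return 0 whenever the TSC reading is not ahead of the timestamp.
 */
static uint64_t clamped_delta(uint64_t tsc_now, uint64_t tsc_timestamp)
{
	return tsc_now > tsc_timestamp ? tsc_now - tsc_timestamp : 0;
}
```

With a freshly hotplugged vcpu whose TSC reads 100 while the stale timestamp
is 1000, raw_delta(100, 1000) yields 2^64 - 900, whereas clamped_delta(100,
1000) yields 0. The clamp hides the symptom but leaves the stale timestamp in
place.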

That said, I am not sure what the best path to fix this is, but the
workaround below is ugly.

> 
> Fix temporarily by returning a zero offset if the delta in
> pvclock_get_nsec_offset() is negative.
> 
> Tested on 3.0.6 guest kernel. Testing this patch requires qemu-kvm from: 
> git://git.kiszka.org/qemu-kvm.git queues/cpu-hotplug
> 
> ---
>  arch/x86/kernel/pvclock.c |   11 ++++++++---
>  1 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/pvclock.c b/arch/x86/kernel/pvclock.c
> index 42eb330..9d31144 100644
> --- a/arch/x86/kernel/pvclock.c
> +++ b/arch/x86/kernel/pvclock.c
> @@ -43,9 +43,14 @@ void pvclock_set_flags(u8 flags)
>  
>  static u64 pvclock_get_nsec_offset(struct pvclock_shadow_time *shadow)
>  {
> -	u64 delta = native_read_tsc() - shadow->tsc_timestamp;
> -	return pvclock_scale_delta(delta, shadow->tsc_to_nsec_mul,
> -				   shadow->tsc_shift);
> +        u64 current_read_tsc = native_read_tsc();
> +        if (current_read_tsc > shadow->tsc_timestamp) {
> +                u64 delta = current_read_tsc - shadow->tsc_timestamp;
> +                return pvclock_scale_delta(delta, shadow->tsc_to_nsec_mul,
> +                                shadow->tsc_shift);
> +        }
> +        /* tsc value can be smaller than tsc_timestamp on a vCPU hotplug */
> +        else return 0;
>  }
>  
>  /*
> -- 
> 1.7.7.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html