Re: [PATCH 4/4] KVM: x86: track actual TSC frequency from the timekeeper struct

Paolo Bonzini <pbonzini@xxxxxxxxxx> · Tue, 16 Feb 2016 17:59:57 +0100

On 16/02/2016 15:25, Marcelo Tosatti wrote:
> On Tue, Feb 16, 2016 at 02:48:16PM +0100, Marcelo Tosatti wrote:
>> On Mon, Feb 08, 2016 at 04:18:31PM +0100, Paolo Bonzini wrote:
>>> When an NTP server is running, it may adjust the time substantially
>>> compared to the "official" frequency of the TSC.  A 12 ppm change
>>> sums up to one second per day.
>>>
>>> This already shows up if the guest compares kvmclock with e.g. the
>>> PM timer.  It shows up even more once we add support for the Hyper-V
>>> TSC page, because the guest expects it to be in sync with the time
>>> reference counter; effectively the time reference counter is just a
>>> slow path to access the same clock that is in the TSC page.
>>>
>>> Therefore, we want kvmclock to provide the host kernel's
>>> ktime_get_boot_ns() value, at least if the master clock is active.
>>> To do so, reverse-compute the host's "actual" TSC frequency from
>>> pvclock_gtod_data and return it from kvm_get_time_and_clockread.
>>
>> Paolo,
>>
>> You'd have to generate an update to the guest structures as well, 
>> to reflect the new {mult,shift} values calculated by the host. 
>> Here:
>>
>>         /* disable master clock if host does not trust, or does not
>>          * use, TSC clocksource
>>          */
>>         if (gtod->clock.vclock_mode != VCLOCK_TSC &&
>>             atomic_read(&kvm_guest_has_master_clock) != 0)
>>                 queue_work(system_long_wq, &pvclock_gtod_work);
>>
>> No?
>>
>> At first, i'm afraid this might be heavy, so it might be interesting
>> to rate limit the update operation.
>>
> 
> Paolo,
> 
> I suppose its not sufficient: 
> 
> 500ppm of 300 seconds = .0005*300 = 0.15 seconds.
> 
> Should aim at avoiding time backwards event in the following situation:
> 
> 
> T1) t1_kvmclock_read = get_nanoseconds(); 
>     /* NTP correction to kernel clock = 500ppm */
>     /* TSC correction via mult,shift = 0ppm */
> 
>     VM-exit, update kvmclock (or Hyper-V) clock data with 
>     new values 
> 
> T2) t2_kvmclock_read = get_nanoseconds();
>     /* NTP correction to kernel clock = 500ppm */
>     /* TSC correction via mult,shift = 500ppm */
> 
> 
> So should not allow the host clock (or system_timestamp) to diverge 
> from (TSC based calculation) more than the duration of the event:
> 
>     VM-exit, update kvmclock (or Hyper-V) with new data.
> 
> To avoid t2_kvmclock_read < t1_kvmclock_read

If I don't do rate limiting, that would not be a problem I think.  The
host timekeeper code should take care of updating the base timestamps
(TSC and nanoseconds) in a way that doesn't cause a clock-goes-backwards
event?  I need to check how often the timekeeper updates the parameters.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html