Re: What time is it kvm-clock?

On Wed, Feb 24, 2016 at 3:35 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> On Wed, Feb 24, 2016 at 09:35:44AM -0800, Peter Hornyack wrote:
>> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
>> > On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
>> >> Specifically, what underlying source of time should be exposed through
>> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
>> >> page?  Recently a couple of threads on kvm-list, along with attempts
>> >> to produce reliable behavior from kvm-clock on our systems have
>> >> highlighted a tension between the current implementation of kvm-clock
>> >> and potentially diverging goals for paravirt time. Here are a few:
>> >>
>> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
>> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
>> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
>> >>
>> >> This question is mostly in regards to kvm-clock in masterclock mode
>> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
>> >> expose a source of time that is more 'true' than the underlying TSC?
>> >> For example, by passing through NTP correction from the host. For the
>> >> current implementation, the answer seems to be... why not both? Once
>> >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
>> >> multiplied by the frequency specified by kvm. On the other hand,
>> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
>> >> are measured against corrected time from the host. A guest reading its
>> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
>> >> guest has run long enough for the TSC to diverge from NTP time. A VMM
>> >> using these ioctls to save and restore clock state can produce wild
>> >> time jumps from the guest's perspective.
>> >>
>> >> The patches in (2) address this mismatch by plumbing updates to clock
>> >> frequency through kvm-clock to the guest. This seems like an important
>> >> design choice for kvm-clock, and IMO deserves at least a clear
>> >> statement of the goals for this interface, if not some more
>> >> discussion.
>> >
>> > Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK?
>> >
>> > The interfaces have been introduced to fix a bug.
>> >
>> >> The (later) thread in (3) claims that synchronizing with
>> >> host time is *not* a goal of kvm-clock.
>> >
>> > It is not.
>> >
>> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
>> >> simply a more enlightened path to the host TSC. Maintaining a
>> >> high-performance path to the TSC in the face of updates is tricky -
>> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
>> >> discussion on the patchset in (2). Is the cost of auditing that the
>> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
>> >> gettimeofday both tracks host time correctly and does not produce any
>> >> backwards warps worth the added value, if it exists? As an
>> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
>> >> function of the last update to kvm-clock or the reference TSC page,
>> >> respectively, sounds very straightforward.
>> >>
>> >> (Outside of masterclock mode, the requirement that the client
>> >> synchronizes across CPUs for monotonicity smooths over a lot of
>> >> complexity - periodically updating kvm-clock to the current time is
>> >> simple and works.)
>> >>
>> >> Regardless of my opinion, I think that a clear statement of the design
>> >> goals for kvm-clock (and kvm's implementation of the reference TSC
>> >> page) would be valuable.
>> >
>> > Documentation/virtual/kvm/timekeeping.txt
>> >
>>
>> Hi Marcelo,
>>
>> While I appreciate all of the detail in timekeeping.txt, it is not a
>> very good reference for what kvm-clock is or how it works. kvm-clock
>> is only mentioned three times in different places throughout that
>> document, and nowhere is there a very clear statement of what
>> kvm-clock is supposed to do or how it does it.
>>
>> For somebody that does not already have a deep understanding of the
>> core masterclock code, trying to understand how kvm-clock works is a
>
> There is no "deep understanding". There is one comment there about
> why you can't update systemtimestamp + tsc_offset (you have to read
> the kvmclock clock read function to understand this sentence) in
> parallel in multiple VCPUs, and that's all masterclock is about.
>
> It's called "master" because there must be only one system_timestamp
> and not multiple (therefore that's the "master" copy of system_time).
>
>> real challenge.
>>
>> Thanks,
>> Peter
>
> Design goals: provide a reliable clocksource device to Linux guests
> so they are able to cope with virtualization problems, namely:
>
> 1. Migration to hosts with different TSC frequency.
> 2. Support for hosts with TSCs that are not stable (whose
> counting frequency changes across processor frequency changes).
>
> How: Expose a clockdevice which counts at 1GHz to guests.

This still doesn't define how closely it is intended to track 1 GHz or
whether NTP slew is applied.
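
To make the question concrete, here is a rough sketch of how a guest turns a TSC reading into nanoseconds from a pvclock record. The field names follow the pvclock ABI, but the struct name and the helper pvclock_ns() are made up for illustration, and this is not the kernel's code. The point is that the "1 GHz" clock is exactly as accurate as the multiplier and shift the host last wrote:

```c
#include <assert.h>
#include <stdint.h>

/* Subset of a pvclock per-vCPU record (struct name is hypothetical). */
struct pvclock_sample {
    uint64_t tsc_timestamp;     /* guest TSC at the host's last update */
    uint64_t system_time;       /* nanoseconds at that same instant */
    uint32_t tsc_to_system_mul; /* 32.32 fixed-point multiplier */
    int8_t   tsc_shift;         /* binary shift applied before the multiply */
};

/* Convert a raw TSC reading to nanoseconds using the record. */
static uint64_t pvclock_ns(const struct pvclock_sample *s, uint64_t tsc)
{
    uint64_t delta = tsc - s->tsc_timestamp;
    uint64_t hi, lo;

    /* Shifting a 64-bit value by 64 or more is undefined in C. */
    assert(s->tsc_shift > -64 && s->tsc_shift < 64);
    if (s->tsc_shift >= 0)
        delta <<= s->tsc_shift;
    else
        delta >>= -s->tsc_shift;

    /* (delta * mul) >> 32, split in halves to avoid a 128-bit type. */
    hi = delta >> 32;
    lo = delta & 0xffffffffu;
    return s->system_time +
           hi * s->tsc_to_system_mul +
           ((lo * s->tsc_to_system_mul) >> 32);
}
```

With a 2 GHz TSC the multiplier would be 0x80000000 (0.5 in 32.32 fixed point), so 2000 ticks after the update the guest sees 1000 ns elapsed. Nothing in this path says whether that 0.5 was derived from a raw calibration or from an NTP-corrected rate.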

> Evolution of masterclock scheme (bugs uncovered):
>
> Problem: time backwards as seen by guests.
> Solution: Fix in guest with pvclock global variable (cmpxchg).

I thought the cmpxchg fix was only needed in non-masterclock mode.
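
For reference, the guest-side fix amounts to something like the following. This is a minimal sketch using C11 atomics rather than the kernel's primitives, and monotonic_clamp() is a made-up name:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return a value that never goes backwards across callers sharing
 * *last_value: either `now`, or the largest value already handed out.
 */
static uint64_t monotonic_clamp(_Atomic uint64_t *last_value, uint64_t now)
{
    uint64_t last;

    assert(last_value != NULL);
    last = atomic_load(last_value);
    do {
        /* Another CPU already returned a later time; reuse it. */
        if (now < last)
            return last;
        /* On failure the CAS reloads `last`, so the check is repeated. */
    } while (!atomic_compare_exchange_weak(last_value, &last, now));

    return now;
}
```

Every reader pays for a shared cache line here, which is the cost that the masterclock scheme is supposed to avoid.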

>
> Problem: gettimeofday() performance
> Solution: Use masterclock scheme (update pvclock areas in sync to avoid
> time backwards event being visible to guests, it's well documented in
> x86.c, if something is unclear please try to understand the code / ask
> and you/we improve the documentation there).

The actual masterclock host code is long and very difficult to follow.

In 4.5-rc, the vDSO guest code is IMO short and reasonably clear.
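
The core of the guest read side is a seqlock-style retry loop on the version field, roughly as below. This is a sketch in C11 atomics with hypothetical names (pv_page, pv_snapshot, pv_read), not the vDSO source, and a strictly conforming version would also need the data fields to be read in a race-free way:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared page; only the fields needed for the sketch. */
struct pv_page {
    _Atomic uint32_t version;   /* odd while the host is mid-update */
    uint64_t tsc_timestamp;
    uint64_t system_time;
};

struct pv_snapshot {
    uint64_t tsc_timestamp;
    uint64_t system_time;
};

/* Copy out a consistent pair, retrying if the host updated concurrently. */
static struct pv_snapshot pv_read(struct pv_page *p)
{
    struct pv_snapshot s;
    uint32_t v;

    assert(p != NULL);
    do {
        v = atomic_load_explicit(&p->version, memory_order_acquire);
        s.tsc_timestamp = p->tsc_timestamp;
        s.system_time = p->system_time;
        /* Keep the data reads ahead of the version re-check. */
        atomic_thread_fence(memory_order_acquire);
    } while ((v & 1) ||
             v != atomic_load_explicit(&p->version, memory_order_relaxed));

    return s;
}
```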

>
> Problem: get_kernel_ns VS TSC clock get out of sync and
> Hyper-V complains about the difference.
>
> Solution: expose the NTP TSC frequency so that guests
> apply NTP frequency correction to their kvmclock reads on TSC as well.
>

I don't understand what you mean.

> ---
>
> About future: agree with Andy that kvmclock should be removed.
> So there is a pending work item there: "verify TSC clocksource
> is fine for exposing to guests, think about the implications for
> management software".
> I can write down a list of items that have been fixed
> for kvmclock and would have to be checked for the TSC clocksource.
>
> Anyone willing to take that task ?
>

How?

On very new hosts (those that support TSC_ADJUST and TSC scaling),
this should be possible.  The host would ideally tell the guest what
frequency of clock it intends to provide (ideally 1 GHz exactly) and
the guest would use it.  I'm not sure this hardware exists yet.

If you enable TSC scaling like this, you may need to supply an ART
(always running timer) adjustment to the guest in case you intend to
pass any ART consumers through to the guest.  Of course, no one
outside Intel has *that* hardware either (AFAIK -- maybe there are
some prototypes floating around).

> ---
>
> About the complaint that "it's not well designed whether NTP correction
> should be applied or not". There are two different things:
>
> 1) Host clock and guest clocks synchronized.
> KVM is not responsible for that, and it can't, because
> Linux exposes a clock which is created in software
> and fixed by NTP.

I don't understand what you mean.

Of course the guest can run its own NTP daemon or similar adjtimex
caller and cause the guest to stop tracking the host.  But if the host
passed CLOCK_MONOTONIC through, then the guest would, by default,
treat kvm-clock as an exactly 1GHz source and would then expose a
disciplined NTP-tracking CLOCK_MONOTONIC through to its user apps even
without an NTP client on the guest.

If integration with the POSIX clock core were provided, the guest
could learn to consume the host's CLOCK_REALTIME as well, as long as
the host uses the TSC as its clocksource.

>
> 2) NTP frequency correction being applied to kvmclock.
>
> This only means that the frequency of the pvclock reads
> in the guest are NTP corrected.

If the host applied NTP frequency correction to the guest, then I
would be happy.  Some folks might want this to be optional.

The guest can do additional correction on top if it wants regardless.
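
As an illustration of what I mean by the host applying the correction: one way would be to fold the host's current frequency adjustment into the multiplier it publishes. The sketch below assumes the correction is available in parts per billion, and apply_freq_correction() is a hypothetical helper, not an existing kernel function:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Scale a 32.32 fixed-point TSC-to-ns multiplier by (1 + ppb / 1e9).
 * A positive ppb makes the guest clock run faster, a negative one slower.
 */
static uint32_t apply_freq_correction(uint32_t mul, int64_t ppb)
{
    int64_t adj = (int64_t)mul * ppb / 1000000000;
    int64_t out = (int64_t)mul + adj;

    /* The corrected multiplier must still fit the 32-bit ABI field. */
    assert(out > 0 && out <= UINT32_MAX);
    return (uint32_t)out;
}
```

A guest that wants its own discipline can still run adjtimex on top of whatever the host publishes.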

--Andy