Re: What time is it kvm-clock?

On Wed, Feb 24, 2016 at 05:19:38PM -0800, Andy Lutomirski wrote:
> On Wed, Feb 24, 2016 at 3:35 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> > On Wed, Feb 24, 2016 at 09:35:44AM -0800, Peter Hornyack wrote:
> >> On Tue, Feb 23, 2016 at 7:57 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> >> > On Tue, Feb 23, 2016 at 06:31:59PM -0800, Owen Hofmann wrote:
> >> >> Specifically, what underlying source of time should be exposed through
> >> >> kvm-clock and other paravirtual ABIs like the HyperV reference tsc
> >> >> page?  Recently a couple of threads on kvm-list, along with attempts
> >> >> to produce reliable behavior from kvm-clock on our systems have
> >> >> highlighted a tension between the current implementation of kvm-clock
> >> >> and potentially diverging goals for paravirt time. Here are a few:
> >> >>
> >> >> 1) kvmclock doesn't work, help?: http://www.spinics.net/lists/kvm/msg125039.html
> >> >> 2) kvmclock: improve accuracy: http://www.spinics.net/lists/kvm/msg127215.html
> >> >> 3) KVM-clock: http://www.spinics.net/lists/kvm/msg127774.html
> >> >>
> >> >> This question is mostly in regards to kvm-clock in masterclock mode
> >> >> (with PVCLOCK_TSC_STABLE set). In this mode, is kvm-clock intended to
> >> >> expose a source of time that is more 'true' than the underlying TSC?
> >> >> For example, by passing through NTP correction from the host. For the
> >> >> current implementation, the answer seems to be... why not both? Once
> >> >> programmed, kvm-clock or the HyperV TSC page will advance with the TSC
> >> >> multiplied by the frequency specified by kvm. On the other hand,
> >> >> KVM_GET_CLOCK, KVM_SET_CLOCK, and the Windows reference counter MSR
> >> >> are measured against corrected time from the host. A guest reading its
> >> >> pvclock gets a very different result from a host KVM_GET_CLOCK if the
> >> >> guest has run long enough for TSC to diverge from NTP time. A VMM
> >> >> using these ioctls to save and restore clock state can produce wild
> >> >> time jumps from the guest's perspective.
> >> >>
> >> >> The patches in (2) address this mismatch by plumbing updates to clock
> >> >> frequency through kvm-clock to the guest. This seems like an important
> >> >> design choice for kvm-clock, and IMO deserves at least a clear
> >> >> statement of the goals for this interface, if not some more
> >> >> discussion.
> >> >
> >> > Design goals of what interface? KVM_GET_CLOCK / KVM_SET_CLOCK?
> >> >
> >> > The interfaces have been introduced to fix a bug.
> >> >
> >> >> The (later) thread in (3) claims that synchronizing with
> >> >> host time is *not* a goal of kvm-clock.
> >> >
> >> > It is not.
> >> >
> >> >> To me, kvm-clock and the HyperV TSC page are extremely effective as
> >> >> simply a more enlightened path to the host TSC. Maintaining a
> >> >> high-performance path to the TSC in the face of updates is tricky -
> >> >> see the extended comment in pvclock_update_vm_gtod_copy, or the
> >> >> discussion on the patchset in (2). Is the cost of auditing that the
> >> >> path from host gettimeofday update -> kvm -> guest pvclock -> guest
> >> >> gettimeofday both tracks host time correctly and does not produce any
> >> >> backwards warps worth the added value, if it exists? As an
> >> >> alternative, implementing KVM_GET_CLOCK or the reference time MSR as a
> >> >> function of the last update to kvm-clock or the reference TSC page,
> >> >> respectively, sounds very straightforward.
> >> >>
> >> >> (Outside of masterclock mode, the requirement that the client
> >> >> synchronizes across CPUs for monotonicity smooths over a lot of
> >> >> complexity - periodically updating kvm-clock to the current time is
> >> >> simple and works.)
> >> >>
> >> >> Regardless of my opinion, I think that a clear statement of the design
> >> >> goals for kvm-clock (and kvm's implementation of the reference TSC
> >> >> page) would be valuable.
> >> >
> >> > Documentation/virtual/kvm/timekeeping.txt
> >> >
> >>
> >> Hi Marcelo,
> >>
> >> While I appreciate all of the detail in timekeeping.txt, it is not a
> >> very good reference for what kvm-clock is or how it works. kvm-clock
> >> is only mentioned three times in different places throughout that
> >> document, and nowhere is there a very clear statement of what
> >> kvm-clock is supposed to do or how it does it.
> >>
> >> For somebody that does not already have a deep understanding of the
> >> core masterclock code, trying to understand how kvm-clock works is a
> >
> > There is no "deep understanding". There is one comment there about
> > why you can't update systemtimestamp + tsc_offset (you have to read
> > the kvmclock clock read function to understand this sentence) in
> > parallel in multiple VCPUs, and that's all masterclock is about.
> >
> > It's called "master" because there must be only one system_timestamp
> > and not multiple (therefore that's the "master" copy of system_time).
> >
> >> real challenge.
> >>
> >> Thanks,
> >> Peter
> >
> > Design goals: provide a reliable clocksource device to Linux guests
> > so they are able to cope with virtualization problems, namely:
> >
> > 1. Migration to hosts with different TSC frequency.
> > 2. Support for hosts with TSCs that are not stable (whose
> > counting frequency changes across processor frequency changes).
> >
> > How: Expose a clockdevice which counts at 1GHz to guests.
> 
> This still doesn't define how closely it is intended to track 1 GHz or
> whether NTP slew is applied.
> 
> > Evolution of masterclock scheme (bugs uncovered):
> >
> > Problem: time backwards as seen by guests.
> > Solution: Fix in guest with pvclock global variable (cmpxchg).
> 
> I thought that was only for non-masterclock.
> 
> >
> > Problem: gettimeofday() performance
> > Solution: Use masterclock scheme (update pvclock areas in sync to avoid
> > time backwards event being visible to guests, it's well documented in
> > x86.c, if something is unclear please try to understand the code / ask
> > and you/we improve the documentation there).
> 
> The actual masterclock host code is long and very difficult to follow.
> 
> In 4.5-rc, the vDSO guest code is IMO short and reasonably clear.
> 
> >
> > Problem: get_kernel_ns VS TSC clock get out of sync and
> > Hyper-V complains about the difference.
> >
> > Solution: expose the NTP TSC frequency so that guests
> > apply NTP frequency correction to their kvmclock reads on TSC as well.
> >
> 
> I don't understand what you mean.
> 
> > ---
> >
> > About future: agree with Andy that kvmclock should be removed.
> > So there is a pending work item there: "verify TSC clocksource
> > is fine for exposing to guests, think about the implications for
> > management software".
> > I can write down a list of items that have been fixed
> > for kvmclock and would have to be checked for tsc clocksource.
> >
> > Anyone willing to take that task ?
> >
> 
> How?
> 
> On very very new hosts (those that support TSC_ADJUST and tsc
> scaling), this should be possible.  

Exactly, TSC scaling.
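
For readers following along, here is a minimal sketch of the arithmetic
behind TSC scaling: the guest-visible TSC is the host TSC multiplied by a
fixed-point ratio, plus an offset. The helper names (tsc_ratio, scale_tsc,
guest_tsc) are hypothetical and are not KVM functions. The number of
fractional bits is hardware-specific, so it is passed in as a parameter
here. The 128-bit intermediate relies on the GCC/Clang __int128 extension.

```c
#include <stdint.h>

/* Hypothetical helper: fixed-point ratio that makes a TSC running at
 * host_khz appear to the guest as if it ran at guest_khz.
 * frac_bits is the number of fractional bits (an assumption of this
 * sketch; real hardware defines its own format). */
static uint64_t tsc_ratio(uint32_t guest_khz, uint32_t host_khz,
                          unsigned frac_bits)
{
    return (uint64_t)(((unsigned __int128)guest_khz << frac_bits) / host_khz);
}

/* Hypothetical helper: multiply a host TSC value by the fixed-point ratio.
 * The product is computed in 128 bits so it cannot overflow. */
static uint64_t scale_tsc(uint64_t host_tsc, uint64_t ratio,
                          unsigned frac_bits)
{
    return (uint64_t)(((unsigned __int128)host_tsc * ratio) >> frac_bits);
}

/* Guest-visible TSC: scaled host TSC plus a per-guest offset. */
static uint64_t guest_tsc(uint64_t host_tsc, uint64_t ratio,
                          unsigned frac_bits, uint64_t offset)
{
    return scale_tsc(host_tsc, ratio, frac_bits) + offset;
}
```

With a 2 GHz host and a 1 GHz target, tsc_ratio(1000000, 2000000, 32)
yields a ratio of exactly one half, so 2,000,000,000 host ticks become
1,000,000,000 guest ticks. Ratios that are not exactly representable
round down, which is one place where small drift can come from.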

> The host would ideally tell the
> guest what frequency of clock it intends to provide (ideally 1 GHz
> exactly) and the guest would use it.  I'm not sure this hardware
> exists yet.
> 
> If you enable TSC scaling like this, you may need to supply an ART
> (always running timer) adjustment to the guest in case you intend to
> pass any ART consumers through to the guest.  Of course, no one
> outside Intel has *that* hardware either (AFAIK -- maybe there are
> some prototypes floating around).
> 
> > ---
> >
> > About complaint that "its not well designed whether NTP correction
> > should be applied or not". There are two different things:
> >
> > 1) Host clock and guest clocks synchronized.
> > KVM is not responsible for that, and it can't, because
> > Linux exposes a clock which is created in software
> > and fixed by NTP.
> 
> I don't understand what you mean.
> 
> Of course the guest can run its own NTP daemon or similar adjtimex
> caller and cause the guest to stop tracking the host.  But if the host
> passed CLOCK_MONOTONIC through, then the guest would, by default,
> treat kvm-clock as an exactly 1GHz source and would then expose a
> disciplined NTP-tracking CLOCK_MONOTONIC through to its user apps even
> without an NTP client on the guest.
> 
> If integration with the POSIX clock core were provided, the guest
> would learn to consume the host's CLOCK_REALTIME as well, as long as
> the host uses the tsc as its clocksource.
> 
> >
> > 2) NTP frequency correction being applied to kvmclock.
> >
> > This only means that the frequency of the pvclock reads
> > in the guest are NTP corrected.
> 
> If the host applied NTP frequency correction to the guest, then I
> would be happy.  Some folks might want this to be optional.
> 
> The guest can do additional correction on top if it wants regardless.
> 
> --Andy

Paolo's track-TSC-offset-multiplier-from-kvmclock-updates should make
enabling masterclock for suspend/resume much simpler.
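
As background for the discussion above about kvm-clock advancing "with the
TSC multiplied by the frequency specified by kvm", here is a minimal
sketch of the guest-side conversion from a TSC reading to nanoseconds.
The structure layout follows the pvclock ABI (struct
pvclock_vcpu_time_info). The function name pvclock_ns is hypothetical, the
TSC value is passed in rather than read with RDTSC, and the version-check
retry loop that a real reader needs is omitted. The 128-bit intermediate
relies on the GCC/Clang __int128 extension.

```c
#include <stdint.h>

/* Per-vCPU time area shared by the host; layout per the pvclock ABI. */
struct pvclock_vcpu_time_info {
    uint32_t version;            /* odd while the host is updating */
    uint32_t pad0;
    uint64_t tsc_timestamp;      /* guest TSC at the last host update */
    uint64_t system_time;        /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_system_mul;  /* 32.32 fixed-point multiplier */
    int8_t   tsc_shift;          /* pre-shift applied to the TSC delta */
    uint8_t  flags;              /* e.g. the TSC-stable bit */
    uint8_t  pad[2];
};

/* Hypothetical helper: convert a TSC reading into nanoseconds using one
 * snapshot of the time area. Between host updates the result advances
 * purely with the TSC, scaled by the multiplier and shift the host set. */
static uint64_t pvclock_ns(const struct pvclock_vcpu_time_info *ti,
                           uint64_t tsc)
{
    uint64_t delta = tsc - ti->tsc_timestamp;

    if (ti->tsc_shift < 0)
        delta >>= -ti->tsc_shift;
    else
        delta <<= ti->tsc_shift;

    /* (delta * mul) >> 32, computed in 128 bits to avoid overflow. */
    return ti->system_time +
           (uint64_t)(((unsigned __int128)delta * ti->tsc_to_system_mul) >> 32);
}
```

Because the multiplier and shift are fixed between host updates, any NTP
frequency correction on the host reaches the guest only if the host
refreshes these fields, which is the mismatch with KVM_GET_CLOCK that the
original question describes.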

