Re: What's kvmclock's custom sched_clock for?

On Thu, Jan 07, 2016 at 04:18:11PM +0100, Radim Krcmar wrote:
> 2016-01-07 00:41-0800, Andy Lutomirski:
> > On Wed, Jan 6, 2016 at 11:18 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> >> AFAICT KVM reliably passes a monotonic TSC through to guests, even if
> >> the host suspends.  That's all that sched_clock needs, I think.
> >>
> >> So why does kvmclock have a custom sched_clock?
> 
> If the host CPU has enough features, then yes, KVM can take care of
> everything and kvmclock has no advantage over TSC, even when migrating
> to a host whose TSC runs at a different frequency, as modern CPUs
> support TSC offset + scaling in guests.
> 
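
To make "TSC offset + scaling" concrete, here is a minimal sketch of the
arithmetic the hardware applies to the host TSC before the guest sees it.
The function names are hypothetical, and the 48-bit fixed-point fraction is
an assumption for illustration; the real width is defined by the CPU.

```python
MASK64 = (1 << 64) - 1  # the TSC is a 64-bit counter


def scaling_ratio(guest_khz, host_khz, frac_bits=48):
    """Fixed-point multiplier that makes the host TSC tick at the guest's rate.

    frac_bits is an assumed fixed-point width, not taken from the thread.
    """
    return (guest_khz << frac_bits) // host_khz


def guest_tsc(host_tsc, ratio, offset, frac_bits=48):
    """Value the guest reads from rdtsc: scale the host TSC, then add the offset."""
    return (((host_tsc * ratio) >> frac_bits) + offset) & MASK64


# Guest started on a 1.5 GHz host, now running on a 3 GHz host:
ratio = scaling_ratio(1_500_000, 3_000_000)
print(guest_tsc(1000, ratio, offset=7))  # 1000 host ticks count as 500 guest ticks, plus 7
```

With both knobs available, the guest's TSC keeps its original frequency and
continues from its pre-migration value, which is why kvmclock adds nothing
on such hosts.
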
> The problem is with antiques.  Guests on old CPUs need to have more
> information on top of TSC to be able to get useful system time.
> And old KVM doesn't provide good information, so we have legacy layers
> everywhere.
> 
> kvmclock in the guest can simply equal rdtsc() on modern CPUs, but we
> still want to use the kvmclock wrapper, because kvmclock can provide a
> stable clock regardless of the underlying TSC (in theory).
> 
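
For reference, a minimal sketch of the "wrapper" being discussed: the guest
reads a per-vCPU record published by the host and converts a raw TSC value
into nanoseconds. The field names follow the pvclock time-info record; the
Python class and function names are hypothetical, and the version-field
retry loop that a real reader needs is omitted here.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1  # emulate 64-bit unsigned arithmetic


@dataclass
class PvclockTimeInfo:
    tsc_timestamp: int      # guest TSC when the host last updated the record
    system_time: int        # guest time in ns at tsc_timestamp
    tsc_to_system_mul: int  # 32.32 fixed-point multiplier (ns per shifted tick)
    tsc_shift: int          # signed shift applied to the TSC delta first


def pvclock_read_ns(info, tsc):
    """Convert a raw TSC reading to nanoseconds using the host-provided record."""
    delta = (tsc - info.tsc_timestamp) & MASK64
    if info.tsc_shift >= 0:
        delta = (delta << info.tsc_shift) & MASK64
    else:
        delta >>= -info.tsc_shift
    return (info.system_time + ((delta * info.tsc_to_system_mul) >> 32)) & MASK64


# A 2 GHz TSC: each tick is 0.5 ns, so the multiplier is 0.5 in 32.32 fixed point.
info = PvclockTimeInfo(tsc_timestamp=1000, system_time=5000,
                       tsc_to_system_mul=1 << 31, tsc_shift=0)
print(pvclock_read_ns(info, 3000))  # 2000 ticks later
```

Because the host can rewrite the record whenever the underlying TSC changes
(migration, frequency change), the guest-visible clock can stay stable even
when the raw TSC does not.
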
> >> On a related note, KVM doesn't pass the "invariant TSC" feature
> >> through to guests on my machine even though "invtsc" is set in QEMU
> >> and the kernel host code appears to support it.  What gives?
> > 
> > I think I solved part of the puzzle.  KVM doesn't like to advertise
> > invtsc by default because that breaks migration.  (Oddly, the end
> > result seems wrong -- with migration, the TSC doesn't stop, but it's
> > not constant, and X86_FEATURE_CONSTANT_TSC is nonetheless set, but
> > whatever.)
> 
> QEMU probably missed that because X86_FEATURE_CONSTANT_TSC is a function
> of family/model.  (For KVM guests, CONSTANT_TSC is equivalent to
> invariant TSC, since guests don't have C-states.)
> 
> >             So the scheduler clock doesn't get marked stable.
> 
> Stable sched clock is quite unrelated to TSC features.  KVMs from the
> last few years should always give a good enough result to allow a stable
> sched clock.  We wanted realtime guests, and realtime Linux needs
> nohz_full, which depends on a stable sched clock.  The result is a huge
> hack.
> 
> We'd need to say that migration creates powerful gravity fields to
> faithfully migrate constant/invariant TSC, but stable sched clock
> doesn't have such strict expectations about time.

Was that supposed to be a joke? 

> > Is that it?
> > 
> > This still doesn't explain why even explicitly trying to set invtsc
> > doesn't seem to work.
> 
> Seems like a bug.  My cpuid is
>    0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000100
> and QEMU says
>   warning: host doesn't support requested feature: CPUID.80000007H:EDX.invtsc [bit 8]
> 
> I'll see if it's in KVM or QEMU.  (We should only forbid migrations to
> hosts with different frequency and without guest TSC scaling.)
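
For anyone reading the cpuid dump above: invtsc is bit 8 of EDX in leaf
0x80000007, as the QEMU warning itself states. A minimal sketch of the
check follows; the helper name is hypothetical.

```python
INVTSC_LEAF = 0x80000007  # CPUID leaf that carries the invariant TSC flag
INVTSC_EDX_BIT = 8        # bit position within EDX, per the QEMU warning


def has_invtsc(edx):
    """Return True if the EDX value from leaf 0x80000007 advertises invariant TSC."""
    return bool((edx >> INVTSC_EDX_BIT) & 1)


# EDX value from the dump quoted above
print(has_invtsc(0x00000100))
```

So the dumped EDX value does have the bit set, which is what makes the
"host doesn't support requested feature" warning look like a bug.
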