Re: What's kvmclock's custom sched_clock for?

2016-01-07 00:41-0800, Andy Lutomirski:
> On Wed, Jan 6, 2016 at 11:18 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>> AFAICT KVM reliably passes a monotonic TSC through to guests, even if
>> the host suspends.  That's all that sched_clock needs, I think.
>>
>> So why does kvmclock have a custom sched_clock?

If the host CPU has enough features, then yes, KVM can take care of
everything and kvmclock has no advantage over TSC, even when migrating
to a host whose TSC runs at a different frequency, because modern CPUs
support TSC offset + scaling for guests.
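
To make "TSC offset + scaling" concrete, here is a minimal sketch of the
arithmetic as I understand it: the host TSC is multiplied by a fixed-point
ratio and an offset is added.  The function name, the parameter names and
treating the number of fractional bits as a parameter are my assumptions
for illustration, not the hardware or KVM interface.  unsigned __int128 is
a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the TSC value a guest would observe.
 * ratio is a fixed-point multiplier with frac_bits fractional bits,
 * so ratio == 1 << frac_bits means "same frequency as the host".
 */
static uint64_t guest_tsc(uint64_t host_tsc, uint64_t ratio,
                          unsigned frac_bits, uint64_t offset)
{
    assert(frac_bits < 64);

    /* Use a 128-bit intermediate so host_tsc * ratio cannot overflow. */
    unsigned __int128 scaled = (unsigned __int128)host_tsc * ratio;

    /* The offset is added modulo 2^64, so it can also move time back. */
    return (uint64_t)(scaled >> frac_bits) + offset;
}
```

With a suitable ratio and offset on the destination host, the guest keeps
seeing a TSC that ticks at the frequency it was started with.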

The problem is with antiques.  Guests on old CPUs need more information
on top of the TSC to be able to get a useful system time.
And old KVM doesn't provide good information, so we have legacy layers
everywhere.

kvmclock in the guest can simply equal rdtsc() on modern CPUs, but we
still want to use the kvmclock wrapper, because kvmclock can provide a
stable clock regardless of the underlying TSC (in theory).
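
A rough sketch of what the wrapper adds on top of a raw TSC read: the host
publishes a reference point plus a conversion factor, and the guest
extrapolates from it.  The struct below is a simplified stand-in modeled on
the pvclock time record; the names pv_time_info and pv_clock_read are
hypothetical, and the version/retry protocol that protects against
concurrent updates is left out on purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified per-vCPU time record (illustrative, not the real ABI layout). */
struct pv_time_info {
    uint64_t tsc_timestamp;     /* TSC value at the last host update */
    uint64_t system_time;       /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_system_mul; /* fixed-point multiplier, 32 fractional bits */
    int8_t   tsc_shift;         /* pre-shift applied to the TSC delta */
};

/* Convert a raw TSC read into nanoseconds using the published record. */
static uint64_t pv_clock_read(const struct pv_time_info *ti, uint64_t tsc)
{
    /* This sketch assumes the TSC did not go backwards since the update. */
    assert(tsc >= ti->tsc_timestamp);

    uint64_t delta = tsc - ti->tsc_timestamp;

    if (ti->tsc_shift < 0)
        delta >>= -ti->tsc_shift;
    else
        delta <<= ti->tsc_shift;

    /* 128-bit intermediate (GCC/Clang extension) avoids overflow. */
    unsigned __int128 ns = (unsigned __int128)delta * ti->tsc_to_system_mul;

    return ti->system_time + (uint64_t)(ns >> 32);
}
```

If the host refreshes the record whenever the TSC frequency or offset
changes, the guest sees a continuous nanosecond clock even when the TSC
underneath is not trustworthy.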

>> On a related note, KVM doesn't pass the "invariant TSC" feature
>> through to guests on my machine even though "invtsc" is set in QEMU
>> and the kernel host code appears to support it.  What gives?
> 
> I think I solved part of the puzzle.  KVM doesn't like to advertise
> invtsc by default because that breaks migration.  (Oddly, the end
> result seems wrong -- with migration, the TSC doesn't stop, but it's
> not constant, and X86_FEATURE_CONSTANT_TSC is nonetheless set, but
> whatever.)

QEMU probably missed that because X86_FEATURE_CONSTANT_TSC is a function
of family/model.  (CONSTANT_TSC is the same as invariant TSC, because KVM
guests don't have C-states.)

>             So the scheduler clock doesn't get marked stable.

Stable sched clock is quite unrelated to TSC features.  KVMs from the
last few years should always give a good enough result to allow a stable
sched clock.  We wanted realtime guests, and realtime Linux needs
nohz_full, which depends on a stable sched clock.  The result is a huge
hack.

We'd need to say that migration creates powerful gravity fields to
faithfully migrate a constant/invariant TSC, but the stable sched clock
doesn't have such strict expectations about time.

> Is that it?
> 
> This still doesn't explain why even explicitly trying to set invtsc
> doesn't seem to work.

Seems like a bug.  My cpuid is
   0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000100
and QEMU says
  warning: host doesn't support requested feature: CPUID.80000007H:EDX.invtsc [bit 8]
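
For reference, decoding that EDX value by hand: invtsc is bit 8 of
CPUID.80000007H:EDX, so 0x00000100 means the host CPU does report it.  A
small sketch (the helper and macro names are made up for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INVTSC_EDX_BIT 8u /* CPUID.80000007H:EDX bit 8 */

/* Test a single bit of a 32-bit CPUID output register. */
static bool reg_bit_set(uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return (reg >> bit) & 1u;
}

/* True if the given EDX value of leaf 0x80000007 advertises invtsc. */
static bool edx_has_invtsc(uint32_t edx)
{
    return reg_bit_set(edx, INVTSC_EDX_BIT);
}
```

edx_has_invtsc(0x00000100) is true, so the warning does not match what the
CPU itself reports.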

I'll see if it's in KVM or QEMU.  (We should only forbid migrations to
hosts with different frequency and without guest TSC scaling.)
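
The rule in the parentheses can be written as a predicate.  This is only a
sketch of the policy I have in mind; the function and its parameters are
hypothetical and do not correspond to existing KVM or QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical policy check for a guest that was given invtsc:
 * migration is fine if the destination TSC runs at the same frequency,
 * or if the destination can scale the guest TSC to hide the difference.
 */
static bool invtsc_migration_ok(uint32_t src_tsc_khz, uint32_t dst_tsc_khz,
                                bool dst_has_tsc_scaling)
{
    /* A frequency of zero would mean "unknown" and is not handled here. */
    assert(src_tsc_khz != 0 && dst_tsc_khz != 0);

    return src_tsc_khz == dst_tsc_khz || dst_has_tsc_scaling;
}
```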