On Thu, Jan 07, 2016 at 04:18:11PM +0100, Radim Krcmar wrote:
> 2016-01-07 00:41-0800, Andy Lutomirski:
> > On Wed, Jan 6, 2016 at 11:18 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> >> AFAICT KVM reliably passes a monotonic TSC through to guests, even if
> >> the host suspends.  That's all that sched_clock needs, I think.
> >>
> >> So why does kvmclock have a custom sched_clock?
>
> If the host CPU has enough features, then yes, KVM can take care of
> everything and kvmclock has no advantage over TSC, even when migrating
> to a TSC with a different frequency, as modern CPUs support TSC offset +
> scaling in guests.
>
> The problem is with antiques.  Guests on old CPUs need to have more
> information on top of TSC to be able to get useful system time.
> And old KVM doesn't provide good information, so we have legacy layers
> everywhere.
>
> kvmclock in the guest can just be equal to rdtsc() with modern CPUs, but
> we still want to use the kvmclock wrapper, because kvmclock can provide a
> stable clock regardless of the underlying TSC (in theory).
>
> >> On a related note, KVM doesn't pass the "invariant TSC" feature
> >> through to guests on my machine even though "invtsc" is set in QEMU
> >> and the kernel host code appears to support it.  What gives?
> >
> > I think I solved part of the puzzle.  KVM doesn't like to advertise
> > invtsc by default because that breaks migration.  (Oddly, the end
> > result seems wrong -- with migration, the TSC doesn't stop, but it's
> > not constant, and X86_FEATURE_CONSTANT_TSC is nonetheless set, but
> > whatever.)
>
> QEMU probably missed that because X86_FEATURE_CONSTANT_TSC is a function
> of family/model.  (CONSTANT_TSC is the same as invariant TSC as KVM
> guests don't have C-states.)
>
> > So the scheduler clock doesn't get marked stable.
>
> Stable sched clock is quite unrelated to TSC features.  KVMs from the
> last few years should always give a good enough result to allow stable
> sched clock.  We wanted realtime guests and realtime linux needs
> no_hz=full that depends on stable sched clock.  The result is a huge hack.
>
> We'd need to say that migration creates powerful gravity fields to
> faithfully migrate constant/invariant TSC, but stable sched clock
> doesn't have such strict expectations about time.

Was that supposed to be a joke?

> > Is that it?
> >
> > This still doesn't explain why even explicitly trying to set invtsc
> > doesn't seem to work.
>
> Seems like a bug.  My cpuid is
>   0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000100
> and QEMU says
>   warning: host doesn't support requested feature: CPUID.80000007H:EDX.invtsc [bit 8]
>
> I'll see if it's in KVM or QEMU.  (We should only forbid migrations to
> hosts with different frequency and without guest TSC scaling.)
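
For reference, the quoted cpuid dump can be checked by hand: the QEMU
warning names CPUID.80000007H:EDX bit 8 as invtsc, and edx=0x00000100 is
exactly that bit. Below is a minimal sketch of the check, assuming the
"leaf subleaf: eax=... ebx=... ecx=... edx=..." line format shown in the
quote. The helper names parse_cpuid_line and has_invtsc are made up for
illustration and do not belong to any existing tool.

```python
import re

INVTSC_LEAF = 0x80000007  # leaf named in the QEMU warning
INVTSC_EDX_BIT = 8        # CPUID.80000007H:EDX.invtsc [bit 8]


def parse_cpuid_line(line):
    """Parse one 'leaf subleaf: eax=.. ebx=.. ecx=.. edx=..' dump line.

    The line format is an assumption taken from the quoted output.
    """
    head, _, regs = line.partition(":")
    leaf, subleaf = (int(tok, 16) for tok in head.split())
    values = {
        name: int(val, 16)
        for name, val in re.findall(r"(e[a-d]x)=(0x[0-9a-fA-F]+)", regs)
    }
    return leaf, subleaf, values


def has_invtsc(line):
    """Return True if the dump line reports the invariant TSC bit."""
    leaf, _subleaf, regs = parse_cpuid_line(line)
    if leaf != INVTSC_LEAF:
        raise ValueError("not a 0x80000007 line: leaf=%#x" % leaf)
    return bool((regs["edx"] >> INVTSC_EDX_BIT) & 1)


# The dump line quoted above.
dump = ("0x80000007 0x00: eax=0x00000000 ebx=0x00000000 "
        "ecx=0x00000000 edx=0x00000100")
print(has_invtsc(dump))  # True, since 0x100 == 1 << 8
```

So the CPUID values in that dump do have the bit set, which is
consistent with the "seems like a bug" reading of the QEMU warning.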