Re: [PATCH 1/5] x86/kvm: On KVM re-enable (e.g. after suspend), update clocks

On Mar 17, 2016 8:10 AM, "Radim Krcmar" <rkrcmar@xxxxxxxxxx> wrote:
>
> 2016-03-16 16:07-0700, Andy Lutomirski:
> > On Wed, Mar 16, 2016 at 3:59 PM, Radim Krcmar <rkrcmar@xxxxxxxxxx> wrote:
> >> 2016-03-16 15:15-0700, Andy Lutomirski:
> >>> FWIW, if you ever intend to support ART ("always running timer")
> >>> passthrough, this is going to be a giant clusterfsck.  Good luck.  I
> >>> haven't gotten a straight answer as to what hardware actually supports
> > >>> that thing, so even testing isn't easy.
> >>
> >> Hm, AR TSC would be best handled by doing nothing ... dropping the
> >> faking logic just became tempting.
>
> ART is different from what I initially thought, it's the underlying
> mechanism for invariant TSC and nothing more ...  we already forbid
> migrations when the guest knows about invariant TSC, so we could do the
> same and let ART be virtualized.  (Suspend has to be forbidden too.)

It's more than that -- it's a TSC-like clock that can be read by PCIe devices.

>
> > As it stands, ART is screwed if you adjust the VMCS's tsc offset.  But
>
> Luckily, assigning real hardware can prevent migration or suspend, so we
> won't need to adjust the offset during runtime.  TSC is a generally
> unmigratable device that just happens to live on the CPU.
>
> (It would have been better to hide TSC capability from the guest and only
>  use rdtsc for kvmclock if the guest wanted fancy features.)
>

I think that, if KVM passes through an ART-supporting NIC, it might be
rather messy to try to avoid passing through TSC as well.  But maybe a
pvclock-like structure could expose the ART-kvmclock offset and scale.
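
For illustration only, here is a minimal sketch of what such a structure
and its conversion might look like. Everything in it is an assumption:
the struct name, its fields, and the helper are hypothetical and are not
part of any existing KVM ABI. It borrows the shift-then-multiply
fixed-point scaling style that pvclock uses for the TSC, and it relies
on the GCC/Clang `unsigned __int128` extension for the wide multiply.

```c
#include <stdint.h>

/* Hypothetical layout; nothing like this exists in KVM's guest ABI. */
struct art_pvclock_info {
	uint32_t art_to_ns_mul;      /* 32.32 fixed-point multiplier */
	int8_t   art_shift;          /* shift applied to ART before scaling */
	uint64_t kvmclock_offset_ns; /* kvmclock reading at ART == 0 */
};

/* Convert a raw ART reading to kvmclock nanoseconds (sketch only). */
static uint64_t art_to_kvmclock_ns(const struct art_pvclock_info *info,
				   uint64_t art)
{
	uint64_t scaled;

	/* Pre-shift so that the multiplier fits in 32 bits. */
	if (info->art_shift < 0)
		art >>= -info->art_shift;
	else
		art <<= info->art_shift;

	/* 64x32 multiply in 128 bits, then drop the 32 fractional bits. */
	scaled = (uint64_t)(((unsigned __int128)art * info->art_to_ns_mul) >> 32);

	return scaled + info->kvmclock_offset_ns;
}
```

With a multiplier of 0x80000000 (0.5) and a shift of 1, the scale is
exactly 1, so an ART reading of 1000 with an offset of 5 maps to 1005.
A real interface would also need some way for the guest to read the
fields consistently while the host updates them, which this sketch
leaves out.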

> > I think it's also screwed if you migrate to a machine with a different
> > ratio of guest TSC ticks to host ART ticks or a different offset,
> > because the host isn't going to do the rdmsr every time it tries to
> > access the ART, so passing it through might require a paravirt
> > mechanism no matter what.
>
> It's almost certain that the other host will have a different offset,
> which makes TSC unmigratable in software without even considering ART
> or frequencies.  Well, KVM already emulates different TSC frequency, so
> we could emulate ART without sinking much lower. :)
>
> > ISTM that, if KVM tries to keep the guest TSC monotonic across
> > migration, it should probably also keep it monotonic across host
> > suspend/resume.
>
> Yes, "Pausing" TSC during suspend or migration is one way of improving
> the TSC estimate.  If we want to emulate ART, then the estimate is
> noticeably lacking, because TSC and ART are defined by a simple
> equation (SDM 2015-12, 17.14.4 Invariant Time-Keeping):
>  TSC_Value = (ART_Value * CPUID.15H:EBX[31:0]) / CPUID.15H:EAX[31:0] + K
>
> where the guest thinks that CPUID and K are constant (between events
> that the guest knows of), so we should give the best estimate of how
> many TSC cycles have passed.  (The best estimate is still lacking.)
>
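
A minimal sketch of the conversion in the SDM formula quoted above. The
function and parameter names are assumptions chosen for illustration:
numerator stands for CPUID.15H:EBX[31:0], denominator for
CPUID.15H:EAX[31:0], and k for the constant K. The division is split
into quotient and remainder so that the intermediate product does not
overflow 64 bits for ordinary ART values. The caller must make sure that
denominator is nonzero.

```c
#include <stdint.h>

/*
 * TSC_Value = (ART_Value * numerator) / denominator + k
 *
 * Splitting art into quotient and remainder gives the same result as
 * floor(art * numerator / denominator) without a 128-bit product:
 * rem < denominator < 2^32 and numerator < 2^32, so rem * numerator
 * always fits in 64 bits.
 */
static uint64_t art_to_tsc(uint64_t art, uint32_t numerator,
			   uint32_t denominator, uint64_t k)
{
	uint64_t quot = art / denominator;
	uint64_t rem = art % denominator;

	return quot * numerator + (rem * numerator) / denominator + k;
}
```

For example, with a ratio of 216/2 and K = 0, an ART value of 24000000
converts to a TSC value of 2592000000. If the host changes the VMCS TSC
offset, or the guest moves to a host with a different ratio, the values
the guest cached for this calculation no longer describe the hardware.
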
> >                  After all, host suspend/resume is kind of like
> > migrating from the pre-suspend host to the post-resume host.  Maybe it
> > could even share code.
>
> Hopefully ... host suspend/resume is driven by kernel and migration is
> driven by userspace, which might complicate sharing.

Good point.

--Andy