Re: [PATCH 3/3] Fix TSC MSR read in nested SVM

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 03, 2011 at 01:34:58AM -0700, Zachary Amsden wrote:
> Caution: this requires more care.
> 
> Pretty sure this breaks userspace suspend at the cost of supporting a
> not-so-reasonable hardware feature.
> 
> On Tue, Aug 2, 2011 at 5:55 AM, Nadav Har'El <nyh@xxxxxxxxxx> wrote:
> > When the TSC MSR is read by an L2 guest (when L1 allowed this MSR to be
> > read without exit), we need to return L2's notion of the TSC, not L1's.
> >
> > The current code incorrectly returned L1 TSC, because svm_get_msr() was also
> > used in x86.c where this was assumed, but now that these places call the new
> > svm_read_l1_tsc(), the MSR read can be fixed.
> >
> > Signed-off-by: Nadav Har'El <nyh@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/svm.c |    4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > --- .before/arch/x86/kvm/svm.c  2011-08-02 15:51:02.000000000 +0300
> > +++ .after/arch/x86/kvm/svm.c   2011-08-02 15:51:02.000000000 +0300
> > @@ -2907,9 +2907,7 @@ static int svm_get_msr(struct kvm_vcpu *
> >
> >        switch (ecx) {
> >        case MSR_IA32_TSC: {
> > -               struct vmcb *vmcb = get_host_vmcb(svm);
> > -
> > -               *data = vmcb->control.tsc_offset +
> > +               *data = svm->vmcb->control.tsc_offset +
> >                        svm_scale_tsc(vcpu, native_read_tsc());
> >
> >                break;
> >
> 
> 
> This is correct.  Now you properly return the correct MSR value for
> the TSC to the guest in all cases.
> 
> However, there is a BIG PROBLEM (yes, it is that bad).  Sorry I did
> not think of it before.
> 
> When the guest gets suspended and frozen by userspace, to be restarted
> later, what qemu is going to do is come along and read all of the MSRs
> as part of the saved state.  One of those happens to be the TSC MSR.
> 
> Since you can't guarantee suspend will take place when only L1 is
> running, you may have mixed L1/L2 TSC MSRs now being returned to
> userspace.  Now, when you resume this guest, you will have mixed L1/L2
> TSC values written into the MSRs.
> 
> Those will almost certainly fail to be matched by the TSC offset
> matching code, and thus, with multiple VCPUs, you will end up with
> slightly unsynchronized TSCs, and with that, all the problems
> associated with unstable TSC come back to haunt you again.  Basically,
> all bets for stability are off.

TSC synchronization is the least of your problems if you attempt to
save/restore state a guest while any vcpu is in L2 mode.

So to keep consistency between the remaining MSRs, i agree with Nadav's
patch. Apparently SVM's original patches to return L1 TSC were aimed at
fixing the problem VMX is facing now, which is fixed by introduction
read_l1_tsc helpers.

> Apart from recording the L1 and L2 TSC through separate MSRs, there
> isn't a good way to solve this.  Walking through all the steps again,
> I'm pretty sure this is why I thought it would be better for L0
> kvm_get_msr() to return the L1 values even if L2 was running.
> 
> In the end, it may not be worth the hassle to support this mode of
> operation that to our current knowledge, is in fact, unused.  I would
> much prefer to see a bad virtualization of a known insecure and unused
> guest mode than to have a well used feature such as L1 guest suspend /
> resume continue to cause clock de-synchronization.
> 
> Zach
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux