Re: [PATCH 3/3] Fix TSC MSR read in nested SVM

Hi,

On Wed, Aug 03, 2011, Zachary Amsden wrote about "Re: [PATCH 3/3] Fix TSC MSR read in nested SVM":
> Pretty sure this breaks userspace suspend at the cost of supporting a
> not-so-reasonable hardware feature.
>...
> This is correct.  Now you properly return the correct MSR value for
> the TSC to the guest in all cases.
> 
> However, there is a BIG PROBLEM (yes, it is that bad).  Sorry I did
> not think of it before.
> 
> When the guest gets suspended and frozen by userspace, to be restarted
> later, what qemu is going to do is come along and read all of the MSRs
> as part of the saved state.  One of those happens to be the TSC MSR.

And just when I thought we were done with this bug :(

Does live migration (or suspend) actually work with nested SVM in the current
code? I certainly don't expect it to work correctly with nested VMX.

Also, I vaguely remember a discussion a while back on this mailing list about
the topic of live migration and nested virtualization, and one of the ideas
raised was that before we can migrate L1, we should force an exit from L2 to
L1, either really (with some real exit reason) or artificially (call the exit
emulation function directly). Then, during the migration we will be sure to
have all the L1 MSRs, in-memory structures, and so on, updated, and,
importantly, we will also be sure to have vmcs12 (the vmcs that L1 keeps for
L2) updated in L1's memory - so that we don't need to send even more internal
KVM state (like vmcs02) during the live migration.
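
To make that idea concrete, here is a toy model of the "force an exit from L2
before saving state" ordering. It is only a sketch: struct toy_vcpu and both
functions are hypothetical names invented for this illustration, not KVM
code, and a single guest RIP field stands in for everything that would really
have to be synced from vmcs02 back into vmcs12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only; none of these names exist in KVM. */
struct toy_vcpu {
    bool     in_l2;             /* true while the vcpu is running L2 */
    uint64_t vmcs02_guest_rip;  /* KVM-internal state used to run L2 */
    uint64_t vmcs12_guest_rip;  /* copy that lives in L1's memory */
};

/*
 * Emulate an artificial exit from L2 to L1: copy the L2 state KVM holds
 * internally back into the structure L1 keeps in its own memory, then
 * leave guest mode.
 */
static void toy_force_nested_exit(struct toy_vcpu *v)
{
    assert(v != NULL);
    v->vmcs12_guest_rip = v->vmcs02_guest_rip;
    v->in_l2 = false;
}

/*
 * Called before the MSRs and memory are saved. Returns true if an exit
 * had to be forced. Afterwards only L1 state needs to be migrated.
 */
static bool toy_prepare_for_migration(struct toy_vcpu *v)
{
    assert(v != NULL);
    if (!v->in_l2)
        return false;
    toy_force_nested_exit(v);
    return true;
}
```

With this ordering, the MSR reads done by userspace always happen while the
vcpu is in L1, so the question of which TSC to return would not come up on
the save path.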

> In the end, it may not be worth the hassle to support this mode of
> operation that to our current knowledge, is in fact, unused.  I would

I do agree that this doesn't sound like a useful mode of operation - but
I also don't like having deliberate mistakes in the code because they
have useful side-effects. I guess that if we can't figure out a way around
this new problem, what I can do is create a patch that:

  1. Always returns L1's TSC for the MSR (as in the original SVM code).

  2. Puts a big comment above this function, about it being architecturally
     *wrong*, but still useful (and explains why).

  3. Checks for the case where users might expect the architecturally-correct
     version, not the current "wrong" version. I.e., checks if L1 allows L2
     exit-less reads from TSC, using the MSR bitmap; if it does, kills the
     guest, or finds a way to prevent this setting.
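
A rough sketch of steps 1 and 3, written as standalone C rather than as a
patch. All function and enum names here are hypothetical. The bitmap layout
assumed is the one described for the SVM MSR permission map: two bits per MSR
(the lower bit intercepts reads, the upper bit intercepts writes), with MSRs
0x0000-0x1FFF covered by the first 0x800 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MSR_IA32_TSC           0x00000010u
#define MSRPM_LOW_RANGE_BYTES  0x800   /* covers MSRs 0x0000..0x1FFF */

/*
 * Hypothetical helper: does L1's MSR permission map intercept reads of
 * this MSR? Only the low MSR range is handled in this sketch.
 */
static bool l1_intercepts_msr_read(const uint8_t *msrpm, uint32_t msr)
{
    assert(msr <= 0x1fffu);
    uint32_t bit = msr * 2;     /* even bit = read intercept */
    return (msrpm[bit / 8] >> (bit % 8)) & 1u;
}

enum tsc_policy { TSC_POLICY_OK, TSC_POLICY_REJECT };

/*
 * Step 3: if L1 lets L2 read the TSC MSR without an exit, L1 may expect
 * the architecturally-correct (L2) value, which step 1 does not provide.
 */
static enum tsc_policy check_l1_tsc_policy(const uint8_t *msrpm)
{
    return l1_intercepts_msr_read(msrpm, MSR_IA32_TSC)
        ? TSC_POLICY_OK : TSC_POLICY_REJECT;
}

/*
 * Step 1: the value returned for the TSC MSR is always L1's TSC, i.e. the
 * host TSC plus L1's offset, ignoring any offset L1 set up for L2.
 * Architecturally wrong while L2 runs, but it is what userspace needs
 * when it saves the MSRs.
 */
static uint64_t tsc_msr_read_value(uint64_t host_tsc, uint64_t l1_tsc_offset)
{
    return host_tsc + l1_tsc_offset;   /* wraps modulo 2^64, as the TSC does */
}
```

What to do on TSC_POLICY_REJECT (kill the guest, or force the intercept bit
on) is left open here, as it is in the list above.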

Thanks,
Nadav.


-- 
Nadav Har'El                        |        Wednesday, Aug  3 2011, 3 Av 5771
nyh@xxxxxxxxxxxxxxxxxxx             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |Classical music: music written by a
http://nadav.harel.org.il           |decomposing composer.