Hi,

On Wed, Aug 03, 2011, Zachary Amsden wrote about "Re: [PATCH 3/3] Fix TSC MSR read in nested SVM":
> Pretty sure this breaks userspace suspend at the cost of supporting a
> not-so-reasonable hardware feature.
>...
> This is correct.  Now you properly return the correct MSR value for
> the TSC to the guest in all cases.
>
> However, there is a BIG PROBLEM (yes, it is that bad).  Sorry I did
> not think of it before.
>
> When the guest gets suspended and frozen by userspace, to be restarted
> later, what qemu is going to do is come along and read all of the MSRs
> as part of the saved state.  One of those happens to be the TSC MSR.

And just when I thought we were done with this bug :(

Does live migration (or suspend) actually work with nested SVM in the
current code? I certainly don't expect it to work correctly with nested VMX.

Also, I vaguely remember a discussion a while back on this mailing list
about live migration and nested virtualization. One of the ideas raised
was that before we can migrate L1, we should force an exit from L2 to L1,
either really (with some real exit reason) or artificially (by calling the
exit emulation function directly). Then, during the migration, we will be
sure to have all the L1 MSRs, in-memory structures, and so on, updated.
Importantly, we will also be sure to have vmcs12 (the vmcs that L1 keeps
for L2) updated in L1's memory, so that we don't need to send even more
internal KVM state (like vmcs02) during the live migration.

> In the end, it may not be worth the hassle to support this mode of
> operation that to our current knowledge, is in fact, unused.  I would

I do agree that this doesn't sound like a useful mode of operation - but
I also don't like having deliberate mistakes in the code because they have
useful side-effects.

I guess that if we can't figure out a way around this new problem, what I
can do is create a patch that does the following:

1. Always return L1's TSC for the MSR (as in the original SVM code).

2. Put a big comment above this function, about it being architecturally
   *wrong*, but still useful (and explain why).

3. Check for the case where users might expect the architecturally-correct
   version, not the current "wrong" version. I.e., check, using the MSR
   bitmap, whether L1 allows L2 exit-less reads from the TSC; if it does,
   kill the guest, or find a way to prevent this setting.

Thanks,
Nadav.

-- 
Nadav Har'El                        |       Wednesday, Aug 3 2011, 3 Av 5771
nyh@xxxxxxxxxxxxxxxxxxx             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |Classical music: music written by a
http://nadav.harel.org.il           |decomposing composer.
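
For illustration only, the check described in item 3 might look roughly like the standalone sketch below. It is not taken from any patch and does not use KVM's real functions. It assumes the SVM MSR permission map layout (two bits per MSR, the low bit controlling reads, with MSRs 0x0000-0x1fff at the start of the map); the helper names l1_intercepts_msr_read, l1_tsc_setup_supported and l1_tsc_msr_value are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_IA32_TSC 0x00000010u /* architectural index of the TSC MSR */
#define MSRPM_SIZE   8192u       /* assumed size of the SVM MSR permission map */

/*
 * Hypothetical helper: does L1's permission map intercept reads of 'msr'?
 * Assumes the SVM layout: two bits per MSR, low bit = read intercept.
 * Only the first MSR range (0x0000-0x1fff) is handled in this sketch.
 */
static bool l1_intercepts_msr_read(const uint8_t *msrpm, uint32_t msr)
{
	uint32_t bit;

	assert(msr <= 0x1fffu);
	bit = msr * 2; /* position of the read-intercept bit */
	return (msrpm[bit / 8] >> (bit % 8)) & 1u;
}

/*
 * Hypothetical policy check for item 3: the "always return L1's TSC"
 * behaviour is only safe if L1 intercepts L2's TSC MSR reads. If this
 * returns false, L1 expects exit-less reads and the caller would have to
 * kill the guest or refuse the setting.
 */
static bool l1_tsc_setup_supported(const uint8_t *msrpm)
{
	return l1_intercepts_msr_read(msrpm, MSR_IA32_TSC);
}

/* Item 1: the value an MSR read returns is L1's TSC (host TSC + L1 offset). */
static uint64_t l1_tsc_msr_value(uint64_t host_tsc, uint64_t l1_tsc_offset)
{
	return host_tsc + l1_tsc_offset; /* unsigned wraparound is intended */
}
```

With a map in which byte 4, bit 0 is set, l1_tsc_setup_supported() reports that L1 intercepts TSC reads; with that bit clear it reports the unsupported, exit-less case. A nested VMX version would need the VMX MSR bitmap layout instead, which differs from the one assumed here.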