Hi, thanks for the summary. I also listened in on the call, and I'm glad
these issues are being discussed.

On Tue, Sep 21, 2010, Chris Wright wrote about "KVM call minutes for Sept 21":
> Nested VMX
> - looking for forward progress and better collaboration between the
>   Intel and IBM teams

I'll be very happy if anyone, be it from Intel or somewhere else, would
like to help me work on nested VMX.

Somebody (I don't recognize your voices yet, sorry...) mentioned on the
call that there might not be much point in cooperation before I finish
getting nested VMX merged into KVM. I agree, but my conclusion is different
from what I think the speaker implied: my conclusion is that it is
important that we merge the nested VMX code into KVM as soon as possible.
If nested VMX is part of KVM - and not a set of patches which becomes stale
the moment after I release it - it will be much easier for people to test
it, use it, and cooperate in developing it.

> - needs more review (not a new issue)

I think the reviews that nested VMX has received over the past year (thanks
to Avi Kivity, Gleb Natapov, Eddie Dong and sometimes others) have been
fantastic. You have shown deep understanding of the code, and found
numerous bugs, oversights, missing features, and a fair share of ugly code,
and we (first Orit and Abel, and then I) have done our best to fix all of
these issues. I've personally learned a lot from the latest round of
reviews and from the discussions with you. So I don't think there has been
any lack of reviews.

I also don't think that getting more reviews is the most important task
ahead of us. Surely, if more people review the code, more potential bugs
will be spotted - but this is always the case, with any software. I think
the question now is: what would it take to finally declare the code "good
enough to be merged", with the understanding that even after being merged
it will still be considered an experimental feature, disabled by default
and documented as experimental? Nested SVM was also merged before it was
perfect, and KVM itself was released before being perfect :-)

> - use cases

I don't kid myself that as soon as nested VMX is available in KVM, millions
of users worldwide will flock to use it. Certainly, many KVM users will
never find a need for nested virtualization. But I do believe that there
are many use cases. We outlined some of them in our paper (to be presented
in a couple of weeks at OSDI):

1. Hosting one of the new breed of operating systems which have a
   hypervisor as part of them. Windows 7 with XP mode is one example;
   Linux with KVM is another.

2. Platforms with hypervisors embedded in firmware need nested
   virtualization to run any workload - which can itself be a hypervisor
   with guests.

3. Cloud users could put a hypervisor with sub-guests in their virtual
   machine, and run multiple virtual machines on the one virtual machine
   which they get.

4. Enabling live migration of entire hypervisors with their guests - for
   load balancing, disaster recovery, and so on.

5. Honeypots and protection against hypervisor-level rootkits.

6. Making it easier to test, demonstrate, benchmark and debug hypervisors,
   and entire virtualization setups. An entire virtualization setup
   (hypervisor and all its guests) can be run as one virtual machine,
   allowing many such setups to be tested on one physical machine.

By the way, I find the question of "why do we need nested VMX" a bit odd,
seeing that KVM already supports nested virtualization (for SVM). Is it the
case that nested virtualization was found useful on AMD processors, but not
on Intel processors? Of course not :-) I think KVM should support nested
virtualization either on both architectures or on neither - and of course I
think it should be on both :-)

> - work todo
>   - merge baseline patch
>     - looks pretty good
>     - review is finding mostly small things at this point
>     - need some correctness verification (both review from Intel and testing)
>   - need a test suite
>     - test suite harness will help here
>     - a few dozen nested SVM tests are there, can follow for nested VMX
>   - nested EPT

I've been keeping track of the issues remaining from the last review, and
indeed only a few remain: only 8 of the 24 patches have any outstanding
issue, and I'm working on those that remain, as you can see on the mailing
list in the last couple of weeks. If there's interest, I can even summarize
these remaining issues.

But since I'm working on these patches alone, I think we need to define our
priorities. Most of the outstanding review comments, while absolutely
correct (and I was amazed by the quality of the reviewers' comments), deal
with rewriting code that already works (to improve its style) or with
fixing relatively rare cases. It is not clear that these issues are more
important than the other things listed in the summary above (test suite,
nested EPT) - but as long as I continue to rewrite pieces of the nested VMX
code, I'll never get to those other important things.

To summarize, I'd love for us to define some sort of plan or roadmap of
what we (or I) need to do before we can finally merge the nested VMX code
into KVM. I would love for this roadmap to be relatively short, leaving
some of the outstanding issues to be done after the merge.
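Regarding the test suite item above: I agree that following the existing
nested SVM tests is the natural way to go. Just to make it concrete, here
is roughly the shape I imagine a first nested VMX test taking. This is only
a sketch under assumptions - the report() helper is an assumed harness
hook, and the setup details are illustrative, not the actual test-harness
API:

/*
 * Sketch of a first nested VMX test, in the spirit of the nested SVM
 * tests. The report() helper is an assumed harness hook, and CR0/CR4
 * fixed-bit handling is glossed over - this shows the shape of such a
 * test, not working harness code. It checks that a guest (L1 under a
 * nested-VMX-enabled L0) can execute VMXON and VMXOFF.
 */
#include <stdint.h>

#define MSR_IA32_VMX_BASIC	0x480
#define X86_CR4_VMXE		(1UL << 13)

extern void report(const char *name, int pass);	/* assumed harness helper */

/* VMXON region: 4KB, 4KB-aligned, VMCS revision id in the first word */
static uint8_t vmxon_region[4096] __attribute__((aligned(4096)));

static uint64_t rdmsr(uint32_t idx)
{
	uint32_t lo, hi;
	asm volatile("rdmsr" : "=a"(lo), "=d"(hi) : "c"(idx));
	return lo | ((uint64_t)hi << 32);
}

int main(void)
{
	uint8_t fail;
	unsigned long cr4;
	/* the test guest runs identity-mapped, so virtual == physical */
	uint64_t paddr = (uintptr_t)vmxon_region;

	*(uint32_t *)vmxon_region = (uint32_t)rdmsr(MSR_IA32_VMX_BASIC);

	/* CR4.VMXE must be set before VMXON */
	asm volatile("mov %%cr4, %0" : "=r"(cr4));
	asm volatile("mov %0, %%cr4" : : "r"(cr4 | X86_CR4_VMXE));

	/* VMXON reports failure through CF or ZF, hence setbe */
	asm volatile("vmxon %1; setbe %0"
		     : "=q"(fail) : "m"(paddr) : "cc", "memory");
	report("vmxon", !fail);

	if (!fail)
		asm volatile("vmxoff");
	return 0;
}

Even a handful of tests of this shape (vmxon/vmxoff, vmclear, vmlaunch of a
trivial L2) would go a long way toward the "correctness verification" the
minutes ask for.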
> - optimize (reduce vmreads and vmwrites)

Before we implemented nested VMX, we also feared that the exits on vmread
and vmwrite would kill the performance. As you can see in our paper (there
is a preprint at http://nadav.harel.org.il/papers/nested-osdi10.pdf), we
showed that this is not the case: while these extra exits do hurt
performance, in common workloads (i.e., not pathological worst-case
scenarios) the trapping vmreads and vmwrites hurt performance only
moderately. For example, with kernbench the nested overhead (over
single-level virtualization) was 14.5%, which could have been reduced to
10.3% if vmread/vmwrite didn't trap. For the SPECjbb workload, the numbers
are 7.8% vs. 6.3%. The numbers would be better if it weren't for the L1
vmreads and vmwrites trapping, but the difference is not huge. We can
certainly start with a version that doesn't do anything about this issue.

So I don't think there is any urgent need to optimize nested VMX (the L0)
or the behavior of KVM as L1. Of course, there is always a long-term desire
to continue optimizing it.
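For those wondering where this overhead comes from: L1's vmreads and
vmwrites cannot run directly, because L1 is not really running the VMCS it
thinks it is running. Each one unconditionally traps to L0, which emulates
it against the software copy of the VMCS that L1 built for L2 (the
structure we call vmcs12 in the patches). Schematically, the read side
looks something like the following - a deliberately simplified sketch with
illustrative struct layout and field handling, not the actual patch code:

/*
 * Simplified sketch of vmread emulation in L0. The struct layout and
 * the two fields handled here are illustrative only; the point is
 * that every vmread executed by L1 costs a full exit to L0 plus this
 * software dispatch.
 */
#include <stdint.h>

/* L0's software image of the VMCS that L1 built for L2 ("vmcs12") */
struct vmcs12 {
	uint64_t guest_rip;		/* GUEST_RIP, encoding 0x681e */
	uint32_t vm_exit_reason;	/* VM_EXIT_REASON, encoding 0x4402 */
	/* ... one member per VMCS field that L1 may access ... */
};

/*
 * Called from the vmread exit handler after decoding the field
 * encoding from L1's instruction. Returns 0 on success, -1 to make
 * the instruction VMfail in L1.
 */
static int vmcs12_read_field(struct vmcs12 *vmcs12,
			     unsigned long field, uint64_t *value)
{
	switch (field) {
	case 0x681e:			/* GUEST_RIP */
		*value = vmcs12->guest_rip;
		return 0;
	case 0x4402:			/* VM_EXIT_REASON */
		*value = vmcs12->vm_exit_reason;
		return 0;
	default:
		return -1;
	}
}

A vmwrite from L1 is the mirror image, updating the vmcs12 member. It is
this exit-plus-software-dispatch on every access that produces the extra
overhead measured above.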
> - has long term maintainer

We have been maintaining this patch set for well over a year now, so I
think we've shown long-term interest in maintaining it, even across
personnel changes. In any case, it would be much easier for us - and for
other people - to maintain this patch set if it were part of KVM, and we
wouldn't need to take care of rebasing it whenever KVM changes.

Thanks,
Nadav.

--
Nadav Har'El                        | Wednesday, Sep 22 2010, 14 Tishri 5771
nyh@xxxxxxxxxxxxxxxxxxx             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |Linux *is* user-friendly. Not
http://nadav.harel.org.il           |idiot-friendly, but user-friendly.