Re: KVM call minutes for Sept 21

 On 09/22/2010 02:04 AM, Nadav Har'El wrote:
Hi, thanks for the summary. I also listened in on the call. I'm glad these issues are being discussed.

On Tue, Sep 21, 2010, Chris Wright wrote about "KVM call minutes for Sept 21":
>  Nested VMX
>  - looking for forward progress and better collaboration between the
>    Intel and IBM teams

I'll be very happy if anyone, be it from Intel or somewhere else, would like
to help me work on nested VMX.

Somebody (I don't recognize your voices yet, sorry...) mentioned on the call
that there might not be much point in cooperation before I finish getting
nested VMX merged into KVM. I agree, but my conclusion is different from what
I think the speaker implied: My conclusion is that it is important that we
merge the nested VMX code into KVM as soon as possible, because if nested VMX
is part of KVM (and not a set of patches which becomes stale the moment after
I release it) this will make it much easier for people to test it, use it,
and cooperate in developing it.

Don't worry, I want to merge nvmx as soon as possible (but not sooner).

>  - needs more review (not a new issue)

I think the reviews that nested VMX has received over the past year (thanks
to Avi Kivity, Gleb Natapov, Eddie Dong and sometimes others), have been
fantastic. You guys have shown deep understanding of the code, and found
numerous bugs, oversights, missing features, and also a fair share of ugly
code, and we (first Orit and Abel, and then I) have done our best to fix all
of these issues. I've personally learned a lot from the latest round of
reviews, and the discussions with you.

So I don't think there has been any lack of reviews. I don't think that
getting more reviews is the most important task ahead of us.

The code is so incredibly complex that each review round raises new issues, simply because they were hidden by other issues previously, or because the reviewer's understanding has only recently reached the point where they can notice them. Usually the review converges after a few rounds. This hasn't happened yet with nvmx, because it is so complicated.

Surely, if more people review the code, more potential bugs will be spotted.
But this is always the case, with any software. I think the question now
is, what would it take to finally declare the code as "good enough to be
merged", with the understanding that even after being merged it will still be
considered an experimental feature, disabled by default and documented as
experimental. Nested SVM was also merged before it was perfect, and KVM
itself was released before being perfect :-)

The bar is set higher than ever before.

My goal is not so much to get perfect vmx emulation as to get code I can understand and modify.

>  - use cases

I don't kid myself that as soon as nested VMX is available in KVM, millions
of users worldwide will flock to use it. Definitely, many KVM users will never
find a need for nested virtualization. But I do believe that there are many
use cases. We outlined some of them in our paper (to be presented in a couple
of weeks at OSDI):

   1. Hosting one of the new breed of operating systems which have a hypervisor
      as part of them. Windows 7 with XP mode is one example. Linux with KVM
      is another.

   2. Platforms with embedded hypervisors in firmware need nested virt to
      run any workload - which can itself be a hypervisor with guests.

   3. Cloud users could put a hypervisor with sub-guests inside their
      virtual machine, and run multiple virtual machines on the one virtual
      machine which they get.

   4. Enable live migration of entire hypervisors with their guests - for
      load balancing, disaster recovery, and so on.

   5. Honeypots and protection against hypervisor-level rootkits.

   6. Make it easier to test, demonstrate, benchmark and debug hypervisors,
      and also entire virtualization setups. An entire virtualization setup
      (hypervisor and all its guests) could be run as one virtual machine,
      allowing testing many such setups on one physical machine.

By the way, I find the question of "why do we need nested VMX" a bit odd,
seeing that KVM already supports nested virtualization (for SVM). Is it the
case that nested virtualization was found useful on AMD processors, but for
Intel processors, it isn't? Of course not :-) I think KVM should support
nested virtualization on both architectures or on neither - and of course
I think it should be on both :-)

Don't worry, nvmx will not get rejected on those grounds. However, the lack of use cases is worrying. I haven't seen the use cases you list above exercised with nsvm, nor have I seen bug reports from users.

>  - work todo
>    - merge baseline patch
>      - looks pretty good
>      - review is finding mostly small things at this point
>      - need some correctness verification (both review from Intel and testing)
>    - need a test suite
>      - test suite harness will help here
>        - a few dozen nested SVM tests are there, can follow for nested VMX
>    - nested EPT

I've been keeping track of the issues remaining from the last review, and
indeed only a few remain. Only 8 of the 24 patches have any outstanding
issue, and I'm working on those that remain, as you could see on the mailing
list in the last couple of weeks. If there's interest, I can even summarize
these remaining issues.

No need.  But, as we haven't converged yet, we may see new review items.

But since I'm working on these patches alone, I think we need to define our
priorities. Most of the outstanding review comments, while absolutely correct
(and I was amazed by the quality of the reviewer's comments), deal with
re-writing code that already works (to improve its style) or fixing relatively
rare cases. It is not clear that these issues are more important than the
other things listed in the summary above (test suite, nested EPT), but as
long as I continue to rewrite pieces of the nested VMX code, I'll never get
to those other important things.

Corner cases in emulation are less important; we can get away with a printk() when we know we're not doing the right thing. What I don't want is silent breakage; that becomes impossible to debug.
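
To make that concrete - this is only a sketch, with a made-up helper name, not code from the patch set - the kind of thing I have in mind is:

/* Sketch only (nested_vmx_warn() is a hypothetical helper, not from
 * the patch set): when emulation knowingly skips a corner case, say so
 * in the log instead of failing silently. */
static void nested_vmx_warn(struct kvm_vcpu *vcpu, const char *what)
{
        printk(KERN_WARNING "kvm: nVMX: %s not emulated for vcpu %d, "
               "L1 guest may misbehave\n", what, vcpu->vcpu_id);
}

/* e.g. in an exit handler, on hitting a control we don't implement: */
        if (control_we_do_not_emulate(vmcs12))
                nested_vmx_warn(vcpu, "this execution control");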

What's absolutely critical is having code that is understood by more people than just you. Part of the process of increasing knowledge of the code is reading it and raising issues... nothing much we can do to change it.


To summarize, I'd love for us to define some sort of plan or roadmap on
what we (or I) need to do before we can finally merge the nested VMX code
into KVM. I would love for this roadmap to be relatively short, leaving
some of the outstanding issues to be done after the merge.

>    - optimize (reduce vmreads and vmwrites)

Before we implemented nested VMX, we also feared that the exits on vmreads and
vmwrites would kill performance. As you can see in our paper (see the preprint
in http://nadav.harel.org.il/papers/nested-osdi10.pdf), we actually showed
that this is not the case - while these extra exits do hurt performance,
in common workloads (i.e., not pathological worst-case scenarios), the
trapping vmread/vmwrite hurts performance only moderately. For example, with
kernbench the nested overhead (over single-level virtualization) was 14.5%,
which could have been reduced to 10.3% if vmread/vmwrite didn't trap.
For the SPECjbb workloads, the numbers are 7.8% vs. 6.3%. As you can see,
the numbers would be better if it weren't for the L1 vmread/vmwrites trapping,
but the difference is not huge. Certainly we can start with a version that
doesn't do anything about this issue.
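
For readers who haven't followed the design, here is roughly where those extra exits come from (a simplified sketch with made-up helper names, not the actual patch code): every vmread or vmwrite that L1 executes traps to L0, which satisfies it from a software copy of L1's VMCS kept in memory and then re-enters L1, so each field access costs a full exit/entry round trip:

/* Simplified sketch of why a vmread by L1 costs an exit: L0 traps it,
 * looks the field up in its in-memory copy of L1's VMCS, stores the
 * result into L1's destination operand, and re-enters L1.  The helper
 * names below are illustrative, not the functions in the patch set. */
static int handle_l1_vmread(struct kvm_vcpu *vcpu)
{
        unsigned long field = decode_vmread_field(vcpu);        /* from exit info */
        u64 value = read_vmcs12_field(get_vmcs12(vcpu), field); /* memory lookup */

        store_vmread_result(vcpu, value);       /* register or memory operand */
        skip_emulated_instruction(vcpu);        /* advance L1 past the vmread */
        return 1;   /* handled in L0; the exit/entry round trip is the cost */
}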

Sure. But I think that for I/O intensive benchmarks the slowdown will be much bigger.

So I don't think there is any urgent need to optimize nested VMX (the L0)
or the behavior of KVM as L1. Of course, there's always a long-term desire
to continue optimizing it.

>  - has long term maintenance implications

We have been maintaining this patch set for well over a year now, so I think
we've shown long term interest in maintaining it, even across personnel
changes. In any case, it would have been much easier for us - and for other
people - to maintain this patch if it was part of KVM, and we wouldn't need
to take care of rebasing when KVM changes.



I'm worried about maintaining core vmx after nvmx is merged, not nvmx itself. There are simply many more things to consider when making a change.

wrt a roadmap, clearly the merge takes precedence, and nested ept would be a strong candidate for second effort. I don't have a lot of advice about speeding up the merge except to reduce the cycle time for patchsets, which is currently quite long.

Note that we can work on a test suite in parallel with the kvm code. Look at kvm-unit-tests.git x86/svm.c.
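
For anyone who wants to start on this in parallel, here is a rough sketch of how a first x86/vmx.c test might look, loosely following the svm.c style (report(), report_summary() and alloc_vmxon_region() below are assumptions about the harness, not an existing interface):

/* Rough sketch of a first x86/vmx.c unit test: probe whether vmxon
 * succeeds inside the L1 guest and report the result. */
#include "libcflat.h"

static int vmxon_succeeds(u64 vmxon_region_pa)
{
        u8 failed;

        /* VMfail sets CF or ZF; setbe captures either case */
        asm volatile("vmxon %1; setbe %0"
                     : "=q"(failed) : "m"(vmxon_region_pa) : "cc", "memory");
        return !failed;
}

int main(void)
{
        /* hypothetical helper: allocate a 4K region, write the VMCS
         * revision id into it, and set CR4.VMXE before returning the
         * region's physical address */
        u64 vmxon_pa = alloc_vmxon_region();

        report("vmxon in L1 guest", vmxon_succeeds(vmxon_pa));
        return report_summary();
}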

--
error compiling committee.c: too many arguments to function


