On 03/01/2010 06:30 PM, Ingo Molnar wrote:
IMO that's a bug, not a feature. There should be a lot more interaction
between kvm-qemu and KVM: for example Qemu should have a feature to install
paravirt drivers in the guest, this would be helped by living in the kernel
repo.
Not in the slightest bit.
To support automatically installing paravirt drivers in a guest, we need
to distribute an ISO containing *binary* versions of drivers. For
Windows, there's a licensing issue that I described earlier with respect
to signing. Figuring out distribution is non-trivial and is being
worked on. So far, Red Hat are the only ones actually capable of
producing signed binaries (no mere mortal can do it). For Linux
drivers, we need to be able to ship different versions of the kernel
drivers for different distribution kernels if we don't want to rely on
what they ship.
The way we've tackled this in the past is by having an awk script that
automatically converts the virtio drivers into something buildable
across kernel versions. It's incredibly difficult to maintain and we
stopped maintaining it about a year ago when virtio drivers became
common in all distro kernels. See
http://git.kernel.org/?p=virt/kvm/kvm-guest-drivers-linux.git if you're
interested.
What would make this much easier for us is if we could add all of the
#ifdef's for various kernel versions in the mainline source tree. I'm
not holding my breath for that though :-)
But once we had an ISO with binary drivers (and such a thing is
available for Windows today), it's just a matter of adding an option to
change the CDROM to the shipped ISO. This is purely within qemu and
doesn't touch kvm.ko at all.
Once the winpv driver's binary hosting is sorted out, virt-manager will
have this feature. There are zero changes required to the kvm kernel
code to support this.
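As a rough illustration of how little is involved on the qemu side, the sketch below builds a command line that points the guest's CDROM at a driver ISO. The helper name `build_qemu_argv`, the binary name `qemu-system-x86_64`, and the file names are assumptions made for the example; `-enable-kvm`, `-hda` and `-cdrom` are ordinary qemu options.

```python
from typing import List, Optional


def build_qemu_argv(disk_image: str,
                    driver_iso: Optional[str] = None,
                    qemu_binary: str = "qemu-system-x86_64") -> List[str]:
    """Return a qemu argv, optionally attaching a driver ISO as the CDROM.

    Hypothetical helper: the binary name and paths are placeholders.
    """
    argv = [qemu_binary, "-enable-kvm", "-hda", disk_image]
    if driver_iso is not None:
        # Swapping the CDROM for the shipped ISO is purely a userspace
        # decision; nothing here involves kvm.ko.
        argv += ["-cdrom", driver_iso]
    return argv


if __name__ == "__main__":
    print(" ".join(build_qemu_argv("guest.img", "virtio-drivers.iso")))
```

A management tool such as virt-manager would do the equivalent through its own configuration layer rather than by assembling argv by hand.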
- It's released together with the kernel, which gives a periodic 3-month
release frequency. Not too slow, not too fast.
qemu releases range in length from 3-6 months depending on
distribution schedules. They are very regular.
The Linux kernel is released every 3 months, +- one week. Our experience is
that even 6 months would be (way) too painful for distros.
I expect that we'll eventually even out to a consistent release
schedule. For now, we're still trying to see what fits us best. The
last 3-month release was very compressed so we're trying something a
little longer this time.
- Code quality requirements are those of the kernel. No muck allowed and
it's not hard to explain what kind of code is preferred.
Code quality is subjective. We have a different coding style.
That's somewhat of a problem when for example a KVM kernel-space developer
crosses into Qemu code and back. Two separate styles, etc. I certainly
remember a 'culture clash' when going from the kernel into Qemu and back.
Different principles, different culture. It's better to standardize.
Some would argue that having diversity of culture is a good thing that
breeds creative thinking :-)
It's annoying to switch coding styles but I don't think it's a major
problem for anyone.
- Tool breakage bisection is a breeze: there's never any mismatch between
tools/perf and the kernel counterpart. With a separate package we'd
have more complex test and bisection scenarios.
KVM has a backwards compatible ABI so there's no such thing as mismatch
between user and kernel space.
perf too is ABI compatible (between releases) - still, bisection is a lot
easier because the evolution of a particular feature can be bisected back to
its origin.
Btw., KVM certainly had ABI breakages around 2.6.16(?) when it was added, even
of released versions.
That was a one-time thing in the very early days of KVM.
Also, within a development version you sure sometimes
iterate a new ABI component, right?
It's not really happened. We introduce new ABIs very rarely. KVM has a
very defined purpose; it provides CPU virtualization. We only extend
the ABI to support new CPU features that we didn't previously support
and since these things are defined by the Intel architecture, it's
fairly easy to define the ABI properly up front.
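To make the compatibility point concrete, here is a minimal sketch of how a userspace program can ask the kernel which API version and optional capabilities it offers, instead of assuming a matching kernel build. The function names are made up for the example. The ioctl numbers assume the x86 `_IO()` encoding, and `KVM_CAP_USER_MEMORY = 3` is taken from `linux/kvm.h`; treat both as assumptions to verify against your headers.

```python
import fcntl
import os
from typing import Optional

KVMIO = 0xAE
# Assumption: on x86, _IO(KVMIO, nr) encodes as (KVMIO << 8) | nr.
KVM_GET_API_VERSION = (KVMIO << 8) | 0x00
KVM_CHECK_EXTENSION = (KVMIO << 8) | 0x03
KVM_CAP_USER_MEMORY = 3  # capability number as listed in linux/kvm.h


def _kvm_ioctl(request: int, arg: int, path: str) -> Optional[int]:
    """Issue one ioctl on the KVM device; return None if it cannot be opened."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None
    try:
        return fcntl.ioctl(fd, request, arg)
    finally:
        os.close(fd)


def kvm_api_version(path: str = "/dev/kvm") -> Optional[int]:
    """Return the API version reported by the kernel, or None."""
    return _kvm_ioctl(KVM_GET_API_VERSION, 0, path)


def kvm_check_extension(cap: int, path: str = "/dev/kvm") -> Optional[int]:
    """Return a nonzero value if the capability is supported, 0 if not."""
    return _kvm_ioctl(KVM_CHECK_EXTENSION, cap, path)


if __name__ == "__main__":
    print("API version:", kvm_api_version())
    print("user memory cap:", kvm_check_extension(KVM_CAP_USER_MEMORY))
```

New functionality is advertised through capabilities like this rather than by changing existing ioctls, which is why an older userspace keeps working on a newer kernel.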
With a time-coherent repository both
intentional and unintentional breakages and variations can be bisected back to
as they happened.
This is an unconditional advantage and i made use of it numerous times.
We used to keep the kernel code in the same repository as the userspace
code. We stopped doing that about a year ago and it's rare that we have
a circumstance where joint bisecting is required.
You should try it. I think you'll find that it's not as obvious a thing to do
as you think it is.
A few years ago I looked into cleaning up Qemu, when i hacked KVM and Qemu. I
also wanted to have a 'qemu light', which is both smaller and cleaner, and
still fits KVM. It didn't look particularly hard back then - but it's
certainly not a zero amount of work.
First impressions are deceptive. My long term goal for qemu is to get
to a point where the device models live independently of the rest of
qemu. I think it's reasonable to split these devices into a modular
library that can then be used by other applications.
That would make it possible to create a kvm-specific virtualization tool
that only supported tap and linux-aio and the bare minimum number of
devices. It would be easy to look at and for kernel hackers to play with.
But to be honest, it would never replace qemu. Once you add a VNC/Spice
server (you need remote connectivity), support for sparse file formats
(because we can't wait forever for btrfs to solve all of our problems),
live migration and snapshotting (required ticky marks for
virtualization), a management layer, and all of the other bells and
whistles, you'll find that you did an awful lot of work to recreate what
qemu does.
Most people that have gone down this road believe that it's more
efficient to just improve qemu's quality than it is to try and replicate
it. So far, we've been pretty successful IMHO.
Regards,
Anthony Liguori
Cleanups pay - they make a piece of code more hackable, more debuggable
and more appealing to new developers. (i suspect you have no argument with
that notion) Also note that it wasn't me who suggested that Qemu wouldn't fit
the kernel standards as-is - it was raised by others in this discussion.
Ingo