On Fri, Nov 11, 2011 at 07:56:58PM +0800, Osier Yang wrote:
Hi, all
This is a basic implementation of a libvirt driver for the Native
Linux KVM Tool. Note that I wrote this out of my own interest and in
my spare time; it's not an endorsement or effort by Red Hat, and it
isn't officially supported by Red Hat.
Basically, the driver is designed to be *stateful*, as KVM tool
doesn't maintain any info about the guest except a socket, which is
for its own IPC. It's implemented using the KVM tool binary, which
is currently named "kvm", along with support for the cgroup
controllers "cpuacct" and "memory". One of KVM tool's principles is
to allow both root and non-root users to play with it, so the driver
is designed to support both root and non-root too, just like the
QEMU driver does. Examples of the connection URI:
virsh -c kvmtool:///system
virsh -c kvmtool:///session
virsh -c kvmtool+unix:///system
virsh -c kvmtool+unix:///session
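The URIs above follow the usual libvirt shape of "driver[+transport]:///path". A minimal sketch of splitting one apart, assuming nothing beyond the four forms listed (the function name `parse_connection_uri` is made up for illustration and is not part of libvirt):

```python
from urllib.parse import urlsplit


def parse_connection_uri(uri):
    """Split a libvirt-style URI into (driver, transport, path).

    Hypothetical helper for illustration only; libvirt does its own
    URI parsing in C.
    """
    parts = urlsplit(uri)
    # The scheme is either "driver" or "driver+transport".
    driver, _, transport = parts.scheme.partition("+")
    return driver, transport or None, parts.path


for uri in ("kvmtool:///system", "kvmtool+unix:///session"):
    print(parse_connection_uri(uri))
```

The path part ("/system" or "/session") is what selects the privileged or per-user instance of the driver.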
The implementation currently supports roughly 15 virsh commands,
including the basic domain lifecycle operations (define/undefine,
start/destroy, suspend/resume, console, setmem, schedinfo, dumpxml,
autostart, dominfo, etc.).
About the domain configuration:
* "kernel": must be specified, as KVM tool currently only supports
booting from a kernel (no integration with a BIOS app yet).
* "disk": only the virtio bus is supported, and the device type must be 'disk'.
* "serial/console": only one console is supported, of type serial or
virtio (this can be extended to support multiple consoles once KVM
tool supports them; libvirt already supports multiple consoles, see
upstream commit 0873b688c).
* "p9fs": only the source dir and mount tag can be specified, and
only type 'mount' is supported.
* "memballoon": only virtio is supported, and there is no way
to configure the address.
* Multiple "disk" and "p9fs" devices are supported.
* Graphics and network are not supported; I will explain below.
Please see "[PATCH 7/8]" for an example of the domain config (which
contains all the XML supported by the current implementation).
The problems of Native Linux KVM Tool from libvirt's p.o.v.:
* Some distros package "qemu-kvm" as "kvm", and "kvm" is also a
long-established name for KVM itself, so naming the project
"kvm" might not be a good idea. I assume it will be named
"kvmtool" in this implementation; never mind this if you
don't like that, it can be updated easily. :-)
Yeah, naming the binary 'kvm' is just madness. I'd strongly recommend
using 'kvmtool' as the binary name to avoid confusion with existing
'kvm' binaries based on QEMU.
* It doesn't have an official package yet, not even a "make install",
which means we have no way to check the dependency at 'configure'
time. I assume it will be installed as "/usr/bin/kvmtool"
in this implementation. I guess this is the main reason which could
prevent upstream libvirt from accepting the patches.
Ok, not really a problem - we do similar for the regular QEMU driver.
* It lacks options for user configuration. For example, with "-vnc"
there is no option for the user to configure the properties of the
VNC server, such as the port. It hides things and doesn't provide
ways to query the properties either. This causes problems for
libvirt in adding VNC support, as VNC clients such as virt-manager
and virt-viewer have no way to connect to the guest. Even vncviewer
can't.
Being able to specify a VNC port of libvirt's choosing is pretty
much mandatory to be able to support that. In addition, being able
to specify the bind address is important to be able to control
security, e.g. to only bind to 127.0.0.1, or only to certain NICs
in a multi-NIC host.
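To make the bind-address point concrete, here is a minimal sketch of a listener restricted to one local address. It is a generic socket example, not KVM tool code, and `listen_on` is a made-up helper name; a management layer would pass a port of its own choosing where this demo passes 0.

```python
import socket


def listen_on(bind_addr, port):
    """Open a TCP listener on exactly one local address and port.

    Hypothetical helper: binding to "127.0.0.1" makes the service
    reachable from the local host only, while "0.0.0.0" would expose
    it on every NIC.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((bind_addr, port))
    srv.listen(1)
    return srv


# Port 0 lets the kernel pick a free port for this demo only.
srv = listen_on("127.0.0.1", 0)
addr, port = srv.getsockname()
print(addr, port)
srv.close()
```

Because the caller picks both values, it also knows exactly where to point a VNC client afterwards, which is the missing piece described above.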
* KVM tool manages the network completely by itself (with DHCP support?),
with no way to configure it except specifying the mode (user|tap|none).
I have not tested it yet, but it would need an explicit script to set up
the network rules (e.g. NAT) for the guest to access the outside world.
Anyway, there is no way for libvirt to control the guest network.
If KVM tool supports TAP devices, can't we do whatever we like with
that just by passing in a configured TAP device from libvirt?
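One way of "passing in" a pre-configured device is to open it in the parent and let the child inherit the file descriptor. The sketch below shows only that inheritance mechanism: a pipe stands in for the TAP device (opening a real one needs privileges), the child is a tiny Python script rather than KVM tool, and telling the child the fd number via argv is an assumption, since the thread does not say KVM tool has any such option.

```python
import os
import subprocess
import sys


def run_child_with_fd(fd):
    """Start a child process that inherits `fd` and learns its number via argv.

    Hypothetical helper; the child here is a stand-in script.
    """
    child_code = (
        "import os, sys\n"
        "fd = int(sys.argv[1])\n"
        "os.write(fd, b'hello from child')\n"
    )
    # pass_fds keeps the descriptor open, under the same number, in the child.
    subprocess.run(
        [sys.executable, "-c", child_code, str(fd)],
        pass_fds=(fd,),
        check=True,
    )


# A pipe stands in for the already-configured TAP device.
read_end, write_end = os.pipe()
run_child_with_fd(write_end)
os.close(write_end)
print(os.read(read_end, 64))
os.close(read_end)
```

The parent keeps full control over how the device was set up; the child only ever sees an open descriptor.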
* There is a gap in the domain status between KVM tool and libvirt.
It's caused by KVM tool unlink()ing the guest socket when the user
exits from the console (both text and graphic), while libvirt still
thinks the guest is running.
Being able to reliably detect shutdown/exit of the KVM tool is
a very important task, and we can't rely on waitpid/SIGCHLD
because we want to daemonize all instances wrt libvirtd.
In the QEMU driver we keep open a socket to the monitor, and
when we see an I/O error / POLLHUP on the socket we know that
QEMU has quit.
What is this guest socket used for ? Could libvirt keep open a
connection to it ?
One other option would be to use inotify to watch for deletion
of the guest socket in the filesystem. This is sort of what we
do with the UML driver.
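The first approach mentioned above (hold a socket open and treat a hangup as "the process has quit") can be sketched as follows. This is not the QEMU driver's code; a socketpair stands in for the monitor connection, `peer_has_exited` is a made-up name, and `select.poll` is assumed to be available, as it is on Linux.

```python
import select
import socket


def peer_has_exited(sock, timeout_ms=0):
    """Return True if the other end of `sock` has gone away.

    Hypothetical helper: POLLHUP/POLLERR, or a readable socket that
    yields zero bytes, is taken to mean the peer has quit.
    """
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    for _fd, events in poller.poll(timeout_ms):
        if events & (select.POLLHUP | select.POLLERR):
            return True
        if events & select.POLLIN:
            try:
                # Readable with nothing to read means end-of-file.
                if sock.recv(1, socket.MSG_PEEK) == b"":
                    return True
            except OSError:
                return True
    return False


manager_end, guest_end = socket.socketpair()
print(peer_has_exited(manager_end))  # peer still alive
guest_end.close()                    # simulate the guest process exiting
print(peer_has_exited(manager_end))  # hangup is now visible
manager_end.close()
```

Unlike waitpid/SIGCHLD, this works even when the watched process is not a child of the watcher, which is the point of the daemonizing remark.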
* KVM tool uses $HOME/.kvm_tool as the state dir, with no way to
configure it. I made a small patch against KVM tool to let it accept
an environment variable, "KVM_STATE_DIR", which is used across the
driver and is needed for the whole patch series to work. See
"[PATCH] kvm tools.....". Generally we want the state dir of
a driver to be "/var/run/libvirt/kvmtool/..." for the root user or
"$HOME/.libvirt/kvmtool/run" for a non-root user.
What does it do with the state dir ? Is that just for storing the
guest socket ?
With QEMU we chose $HOME/.libvirt/qemu or /var/run/libvirt because
there was no policy set by QEMU itself. If KVM tool has a policy
for where it stores its state, we should just use that, and not
try to force it into a libvirt-specific location.
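For reference, the scheme proposed in the quoted text (a per-privilege state directory handed to the tool through "KVM_STATE_DIR") could look roughly like the sketch below. The helper names are made up, and the "..." after "/var/run/libvirt/kvmtool/" is left unfilled, so only the stated prefix is used.

```python
import os


def state_dir(privileged, home):
    """Pick the state dir following the root / non-root split quoted above.

    Hypothetical helper; any subdirectory under the root path is
    unspecified in the thread and is not guessed here.
    """
    if privileged:
        return "/var/run/libvirt/kvmtool"
    return os.path.join(home, ".libvirt", "kvmtool", "run")


def child_env(privileged, home, base_env=None):
    """Return a copy of the environment with KVM_STATE_DIR set."""
    env = dict(os.environ if base_env is None else base_env)
    env["KVM_STATE_DIR"] = state_dir(privileged, home)
    return env


print(child_env(False, "/home/alice", base_env={})["KVM_STATE_DIR"])
```

If the tool's own default location were used instead, as suggested above, `state_dir` would reduce to the tool's policy and the environment override would be unnecessary.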
In a privileged libvirtd instance, we should still aim to have
kvmtool itself run as an unprivileged user/group, e.g. 'kvmtool:kvmtool',
and we could set the home dir of that user to /var/lib/kvmtool.