# Components

This document describes the various components of the KubeVirt architecture, how they fit together, and how they compare to the traditional virtualization architecture (QEMU + libvirt).

## Traditional architecture

For the comparison to make sense, let's start by reviewing the architecture used for traditional virtualization.

![libvirt architecture][Components-Libvirt]

(Image taken from the "[Look into libvirt][]" presentation by Osier Yang, which is a bit old but still mostly accurate from a high-level perspective.)

In particular, the `libvirtd` process runs with high privileges on the host and is responsible for managing all VMs. When asked to start a VM, the management process will

* Prepare the environment by performing a number of privileged operations upfront
  * Set up CGroups
  * Set up kernel namespaces
  * Apply SELinux labels
  * Configure network devices
  * Open host files
  * ...
* Start a non-privileged QEMU process in that environment

## Kubernetes

To understand how KubeVirt works, it's first necessary to have some knowledge of Kubernetes.

In Kubernetes, every user workload runs inside [Pods][]. The pod is the smallest unit of work that Kubernetes will schedule. Some facts about pods:

* They consist of one or more containers
* The containers share a network namespace
* The containers have their own PID and mount namespaces
* The containers have their own CGroups for CPU, memory, devices and so forth. These are controlled by Kubernetes and cannot be modified from outside.
* Pods can be started with extended privileges (`CAP_SYS_NICE`, `CAP_NET_RAW`, root user, ...)
* The application inside the pod can drop privileges, but the pod itself cannot (`kubectl exec` still gives you a shell with full privileges)

Creating pods with elevated privileges is generally frowned upon, and depending on the policy decided by the cluster administrator it might be outright impossible.

## KubeVirt architecture

Let's now discuss how KubeVirt is structured.

![KubeVirt architecture][Components-Kubevirt]

The main components are:

* `virt-launcher`, a copy of which runs inside each pod alongside QEMU and libvirt, is the unprivileged component responsible for receiving commands from other KubeVirt components and reporting back events such as VM crashes;
* `virt-handler` runs at the node level as a DaemonSet, and is the privileged component which takes care of the VM setup by reaching into the corresponding pod and modifying its namespaces;
* `virt-controller` runs at the cluster level and monitors the API server so that it can react to user requests and VM events;
* `virt-api`, also running at the cluster level, exposes a few additional APIs that only apply to VMs, such as the "console" and "vnc" actions.

When a KubeVirt VM is started:

* We request a Pod with certain privileges and resources from Kubernetes.
* The kubelet (the node daemon of Kubernetes) prepares the environment with the help of a container runtime.
* A shim process (virt-launcher) is our main entry point in the pod; it starts libvirt.
* Once our node daemon (virt-handler) can reach the shim process, it performs the privileged setup from outside: it reaches into the pod's namespaces and modifies their contents as needed (a sketch of this technique follows at the end of this section). We mostly have to modify the mount and network namespaces.
* Once the environment is prepared, virt-handler asks virt-launcher to start the VM via its libvirt component.

More information can be found in the [KubeVirt architecture][] page.
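The "reach into the pod from outside" step boils down to entering another process's namespaces and modifying them. As a rough illustration only (this is not KubeVirt's actual code, which uses its own privileged helpers), the sketch below shells out to the util-linux `nsenter` tool to run a command inside the mount and network namespaces of a target process. The PID `12345` and the `ip link show` command are hypothetical placeholders, and running this requires root privileges on the node.

```go
// Sketch: run a command inside the mount and network namespaces of a
// target process, similar in spirit to how a node-level daemon can
// perform privileged setup inside a pod it does not own.
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runInPodNamespaces joins the mount and network namespaces of the
// process identified by pid and executes the given command there.
func runInPodNamespaces(pid int, command ...string) error {
	args := append([]string{
		"--target", fmt.Sprint(pid), // a process inside the target pod
		"--mount", "--net", // join its mount and network namespaces
		"--",
	}, command...)

	cmd := exec.Command("nsenter", args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	// Hypothetical PID of a process inside the launcher pod; must be
	// run as root (or with sufficient capabilities) on the node.
	if err := runInPodNamespaces(12345, "ip", "link", "show"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```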
## Comparison

The two architectures are quite similar from the high-level point of view: in both cases there are a number of privileged components which take care of preparing an environment suitable for running an unprivileged QEMU process in.

The difference, however, is that while libvirtd takes care of all this setup itself, in the case of KubeVirt several smaller components are involved: some of these components are privileged just as libvirtd is, but others are not, and some of the tasks are not even performed by KubeVirt itself but rather delegated to the existing Kubernetes infrastructure.

## Use of libvirtd in KubeVirt

In the traditional virtualization scenario, `libvirtd` provides a number of useful features on top of those available with plain QEMU, including

* support for multiple clients connecting at the same time
* management of multiple VMs through a single entry point
* remote API access

KubeVirt interacts with libvirt under certain conditions that make the features described above irrelevant, as the sketch after the list illustrates:

* there's only one client talking to libvirt: `virt-launcher`
* libvirt is only asked to manage a single VM
* client and libvirt run in the same pod, so there is no remote libvirt access

[Components-Kubevirt]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Images/Components-Kubevirt.png
[Components-Libvirt]: https://gitlab.com/abologna/kubevirt-and-kvm/-/blob/master/Images/Components-Libvirt.png
[KubeVirt architecture]: https://github.com/kubevirt/kubevirt/blob/master/docs/architecture.md
[Look into libvirt]: https://www.slideshare.net/ben_duyujie/look-into-libvirt-osier-yang
[Pods]: https://kubernetes.io/docs/concepts/workloads/pods/
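To make the "single local client, single VM" interaction concrete, here is a minimal sketch (not KubeVirt's actual code) that uses the libvirt Go bindings to connect to a local libvirtd and define and start one domain. The `qemu:///system` URI and the bare-bones domain XML are illustrative assumptions; the real virt-launcher generates the domain XML from the VirtualMachineInstance spec. Building this requires cgo and the libvirt development headers.

```go
// Sketch: one local client defining and starting one domain.
package main

import (
	"log"

	"libvirt.org/go/libvirt"
)

// Bare-bones domain XML for illustration only.
const domainXML = `
<domain type='kvm'>
  <name>testvm</name>
  <memory unit='MiB'>128</memory>
  <vcpu>1</vcpu>
  <os>
    <type arch='x86_64' machine='q35'>hvm</type>
  </os>
</domain>`

func main() {
	// Client and libvirtd share the pod, so a plain local connection
	// over the UNIX socket is sufficient; no remote transport involved.
	conn, err := libvirt.NewConnect("qemu:///system")
	if err != nil {
		log.Fatalf("connecting to libvirt: %v", err)
	}
	defer conn.Close()

	// Define the one and only domain this libvirtd will ever manage...
	dom, err := conn.DomainDefineXML(domainXML)
	if err != nil {
		log.Fatalf("defining domain: %v", err)
	}
	defer dom.Free()

	// ...and start it.
	if err := dom.Create(); err != nil {
		log.Fatalf("starting domain: %v", err)
	}
	log.Println("domain started")
}
```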