Hello All,

I would like to get your input on a KVM feature that I am currently developing.

It performs full and incremental disk backups of running KVM VMs, where a backup is defined as a snapshot of the disk state of all virtual disks configured for the VM.

This backup mechanism is built by modifying the qemu-kvm userland process, and works as follows:

- If a VM is configured for backup, qemu-kvm maintains a list of the blocks dirtied since the last backup. Note that this is different from the dirty blocks list currently maintained for block migration purposes in that it is persistent across VM reboots.
- qemu-kvm creates a thread and listens for backup clients.
- A backup client connects to qemu-kvm and initiates an incremental backup.
  * A snapshot of each virtual disk is created by qemu-kvm. This is as simple as saving the dirty blocks map in the snapshot structure.
  * The dirty blocks are now transferred over to the backup client.
  * While this transfer is in progress, if any blocks are written by the VM, the livebackup code intercepts these writes, saves the old blocks in a qcow2 file, and then allows the write to progress.
  * When the transfer of all dirty blocks in the incremental backup is completed, the snapshot is destroyed.

I have considered other technologies that may be utilized to solve the same problem, such as LVM snapshots. It is possible to create a new LVM logical volume for each virtual disk in the VM. When a VM needs to be backed up, each of these logical volumes is snapshotted. At this point things get messy: I don't really know of a good way to identify the blocks that were modified since the last backup. Also, once these blocks are identified, we need a mechanism to transfer them over a TCP connection to the backup server. Perhaps there is a way to export the 'dirty blocks' map to userland and use a daemon to transfer the blocks.
Or maybe a kernel thread capable of listening on TCP sockets and transferring the blocks over to the backup client (I don't know if this is possible). In any case, my first attempt is to implement this in the qemu-kvm userland binary.

The benefit to the end user of this technology is this: today, IaaS cloud platforms such as EC2 provide you with the ability to have two types of virtual disks in VM instances:

1. Ephemeral virtual disks that are lost if there is a hardware failure
2. EBS storage volumes, which are costly

I think that an efficient disk backup mechanism will enable a third type of virtual disk: one that is backed up, perhaps every hour or so. So a cloud operator using KVM virtual machines can offer three types of VMs:

1. An ephemeral VM that is lost if a hardware failure happens
2. A backed-up VM that can be restored from the last hourly backup
3. A fully highly-available VM running off of a NAS or SAN or some such shared storage

VMware has extensive support for backing up running virtual machines in their products. It is called VMware Consolidated Backup. A lot of it seems to be targeted at Windows VMs, with hooks provided into Microsoft's Volume Shadow Copy Service running in the guest. My proposal will also eventually need the capability to run an agent in the guest for sync'ing the filesystem, flushing database caches, etc. I am also unsure whether just sync'ing an ext3 or ext4 FS and then snapshotting is adequate for backup purposes.

I want to target this feature squarely at the cloud use model, with automated backups scheduled for instances created using an EC2 or OpenStack API.

Please let me know if you find this feature interesting. I am looking forward to feedback on any and all aspects of this design. I would like to work with the KVM community to contribute this feature to the KVM code base.

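As a rough illustration of the incremental backup steps listed earlier (dirty block tracking, snapshot creation, and saving old blocks when the VM writes during a transfer), here is a small sketch. It is written in Python rather than the C of qemu-kvm, it is not the actual livebackup code, and every name in it (LiveBackupDisk, guest_write, start_backup, read_snapshot_block, finish_backup) is hypothetical. An in-memory dict stands in for the qcow2 file, and persistence of the dirty map across VM reboots is not modelled.

```python
class LiveBackupDisk:
    """Toy in-memory model of one virtual disk (all names are hypothetical)."""

    def __init__(self, num_blocks, block_size=4):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.dirty = set()          # blocks written since the last backup
        self.snapshot_dirty = None  # dirty map frozen when a backup starts
        self.cow_store = {}         # stand-in for the qcow2 file of old blocks

    def guest_write(self, block_no, data):
        """Handle a write issued by the VM."""
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        backup_running = self.snapshot_dirty is not None
        if backup_running and block_no not in self.cow_store:
            # First write to this block during the backup: keep the old
            # contents so the snapshot still sees the pre-backup state.
            self.cow_store[block_no] = self.blocks[block_no]
        self.blocks[block_no] = data
        self.dirty.add(block_no)    # counts towards the *next* backup

    def start_backup(self):
        """Create the snapshot by saving the current dirty map."""
        if self.snapshot_dirty is not None:
            raise RuntimeError("a backup is already in progress")
        self.snapshot_dirty = self.dirty
        self.dirty = set()
        return sorted(self.snapshot_dirty)

    def read_snapshot_block(self, block_no):
        """Return the block as it was when the snapshot was created."""
        if self.snapshot_dirty is None:
            raise RuntimeError("no backup in progress")
        return self.cow_store.get(block_no, self.blocks[block_no])

    def finish_backup(self):
        """Destroy the snapshot once all dirty blocks are transferred."""
        self.snapshot_dirty = None
        self.cow_store.clear()


# Usage: two blocks are dirtied, then one is overwritten mid-backup.
disk = LiveBackupDisk(num_blocks=8)
disk.guest_write(2, b"aaaa")
disk.guest_write(5, b"bbbb")
to_send = disk.start_backup()
disk.guest_write(5, b"cccc")  # VM write while the transfer is in progress
backup = {n: disk.read_snapshot_block(n) for n in to_send}
disk.finish_backup()
```

In this sketch the backup client receives the contents of blocks 2 and 5 as they were when the backup started, while block 5 is marked dirty again so that the next incremental backup picks up the newer write. A real implementation could avoid saving old contents for blocks that are not part of the snapshot or that have already been transferred; the sketch saves them unconditionally to stay close to the description.
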
Thanks,
Jagane Sundar