On 02/16/2013 03:51 AM, Jens Kristian Søgaard wrote:
Hi Sage,
1) Decide what output format to use. We want to use something that is
I have given it some thought, and my initial suggestion to keep things
simple is to use the QCOW2 image format.
The bird's-eye view of the process would be as follows:
* Initial backup
User supplied information: pool, image name
Create rbd snapshot of the image named "backup_1", where 1 could be a
timestamp or an integer count.
Save the snapshot to a standard qcow2 image. Similar to:
    qemu-img convert rbd:data/myimage@backup_1 -O qcow2 data_myimage_backup_1.qcow2
Note: I don't know if qemu-img actually supports reading from snapshots
currently.
It does.
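For reference, the initial backup step could be sketched roughly as below. This only builds the command lines and does not run anything. The helper names and the "<pool>_<image>_backup_<id>.qcow2" file naming are assumptions taken from the example above, not an existing tool.

```python
def snapshot_spec(pool, image, snap):
    """Return the 'pool/image@snap' form used by the rbd CLI."""
    return f"{pool}/{image}@{snap}"


def initial_backup_commands(pool, image, snap_id):
    """Build the two commands for an initial backup (hypothetical helper).

    snap_id may be an integer count or a timestamp string.
    """
    snap = f"backup_{snap_id}"
    # File naming follows the example in this thread; it is only a convention.
    out_file = f"{pool}_{image}_{snap}.qcow2"
    return [
        # 1. Take the rbd snapshot.
        ["rbd", "snap", "create", snapshot_spec(pool, image, snap)],
        # 2. Copy the snapshot contents into a standalone qcow2 file.
        ["qemu-img", "convert", "-f", "raw", "-O", "qcow2",
         f"rbd:{pool}/{image}@{snap}", out_file],
    ]


if __name__ == "__main__":
    for cmd in initial_backup_commands("data", "myimage", 1):
        print(" ".join(cmd))
```

Each list can be handed to subprocess.run(cmd, check=True) once the naming scheme is settled.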
* Incremental backup
User supplied information: pool, image name, path to initial backup or
previous incremental file
Create rbd snapshot of the image named "backup_2", where 2 could be a
timestamp or an integer count.
Determine previous snapshot identifier from given file name.
Determine objects changed from the snapshot given by that identifier and
the newly created snapshot.
Construct QCOW2 L1- and L2-tables in memory from that changeset.
Create new qcow2 image with the previous backup file as the backing
image, and write out the tables and changed blocks.
Delete previous rbd snapshot.
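A minimal sketch of two of the steps above: recovering the previous snapshot name from the file name, and mapping changed byte ranges to QCOW2 L1/L2 table indices. It assumes the default 64 KiB cluster size, 8-byte L2 entries, and that the changeset is already available as (offset, length) byte ranges; how to obtain that changeset from rbd is left open here. All function names are hypothetical.

```python
import re

CLUSTER_SIZE = 64 * 1024        # assumed: qcow2 default cluster size
L2_ENTRIES = CLUSTER_SIZE // 8  # one L2 table fills a cluster, 8 bytes per entry


def previous_snapshot(path):
    """Extract 'backup_<id>' from '<pool>_<image>_backup_<id>.qcow2'.

    Assumes the id itself contains no underscore.
    """
    match = re.search(r"_(backup_[^_/]+)\.qcow2$", path)
    if match is None:
        raise ValueError(f"cannot find a snapshot id in {path!r}")
    return match.group(1)


def table_indices(offset):
    """Map a guest byte offset to its (L1 index, L2 index) pair."""
    cluster = offset // CLUSTER_SIZE
    return cluster // L2_ENTRIES, cluster % L2_ENTRIES


def build_tables(changed_extents):
    """Group changed clusters by L1 index: {l1_index: sorted [l2_index, ...]}.

    changed_extents is an iterable of (offset, length) byte ranges.
    """
    tables = {}
    for offset, length in changed_extents:
        if length <= 0:
            continue
        first = offset // CLUSTER_SIZE
        last = (offset + length - 1) // CLUSTER_SIZE
        for cluster in range(first, last + 1):
            l1, l2 = divmod(cluster, L2_ENTRIES)
            tables.setdefault(l1, set()).add(l2)
    return {l1: sorted(l2s) for l1, l2s in tables.items()}
```

With these sizes one L2 table covers 512 MiB of the image, so a changeset confined to a small region touches only one or two L2 tables.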
* Restoring and mounting
The use of the QCOW2 format means that we can use existing tools for
restoring and mounting the backups.
To restore a backup the user can simply choose either the initial backup
file or an incremental, and use qemu-img to copy that to a new rbd image.
To mount the initial backup or an incremental, the user can use qemu-nbd
to mount and explore the backup to determine which one to restore.
The performance of restores and mounts would of course degrade if the
backup consists of a large number of incrementals. In that case the
existing qemu-img tool could be used to flatten the backup.
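The restore, mount and flatten operations could look roughly like the command lines built below. This is only a sketch: the function names, the /dev/nbd0 default and the choice of a read-only mount are assumptions. qemu-nbd also needs the nbd kernel module loaded, and every backing file in the chain must be reachable at the path recorded in the image.

```python
def restore_command(backup_file, pool, new_image):
    """Copy a backup (initial or incremental) into a new rbd image."""
    return ["qemu-img", "convert", "-f", "qcow2", "-O", "raw",
            backup_file, f"rbd:{pool}/{new_image}"]


def mount_command(backup_file, nbd_device="/dev/nbd0"):
    """Expose a backup as a block device, read-only, for inspection."""
    return ["qemu-nbd", "--read-only", f"--connect={nbd_device}", backup_file]


def flatten_command(backup_file, flat_file):
    """Merge an incremental and its backing chain into one standalone file."""
    return ["qemu-img", "convert", "-f", "qcow2", "-O", "qcow2",
            backup_file, flat_file]


if __name__ == "__main__":
    print(" ".join(restore_command("data_myimage_backup_2.qcow2",
                                   "data", "myimage_restored")))
    print(" ".join(mount_command("data_myimage_backup_2.qcow2")))
    print(" ".join(flatten_command("data_myimage_backup_2.qcow2",
                                   "data_myimage_flat.qcow2")))
```

Flattening works because qemu-img convert reads through the backing chain and writes a file with no backing image.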
* Pros/cons
The QCOW2 format supports compression, so we could implement compressed
backups without much effort.
The disadvantage of using QCOW2 like this is that we do not have any
checksumming or safeguards against potential errors such as users
mixing up images.
Another disadvantage to this approach is that vital information is
stored in the actual filename of the backup file. I don't see any place
in the QCOW2 file format for storing this information inside the file,
sadly.
We could opt for storing it inside a plain text file that accompanies
the QCOW2 file, or tarballing the qcow2 file and that plain text file.
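One possible shape for such an accompanying file, sketched under assumptions: a JSON sidecar named "<backup>.meta" that records pool, image, snapshot and parent file, plus a SHA-256 of the qcow2 file to cover the checksumming concern as well. The field names, the ".meta" suffix and the use of JSON are all illustrative choices, not something agreed on in this thread.

```python
import hashlib
import json


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large backups need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_sidecar(backup_file, pool, image, snapshot, parent_file=None):
    """Write '<backup_file>.meta' (hypothetical layout) and return its path."""
    meta = {
        "pool": pool,
        "image": image,
        "snapshot": snapshot,
        "parent": parent_file,  # None for the initial backup
        "sha256": sha256_of(backup_file),
    }
    sidecar = backup_file + ".meta"
    with open(sidecar, "w") as f:
        json.dump(meta, f, indent=2, sort_keys=True)
        f.write("\n")
    return sidecar


def verify_sidecar(backup_file):
    """Return True if the backup still matches its recorded checksum."""
    with open(backup_file + ".meta") as f:
        meta = json.load(f)
    return meta["sha256"] == sha256_of(backup_file)
```

Keeping the snapshot name in the sidecar would mean the incremental step no longer has to parse it out of the file name.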
qcow2 seems like a good initial format given the existing tools. We
could always add another format later, or wrap it with extra
information like you suggest.
Have you had a chance to start implementing this yet? It'd be great to
get it working in the next month.
Josh