Hi list,

I'm running libvirt/QEMU guests on RBD, and currently taking backups by
issuing a domfsfreeze, taking a snapshot, and then issuing a domfsthaw.
This seems to be a common approach.

This is safe but impactful: the guest's I/O is frozen for the duration
of the snapshot, which is usually only a few seconds.

Unfortunately, the freeze action doesn't seem to be very reliable.
Sometimes it times out, leaving the guest in a messy state with frozen
I/O (when this happens the thaw times out too, or returns success but
the filesystems end up frozen anyway). This is clearly a bug somewhere,
but I wonder whether the freeze is a hard requirement at all.
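
For concreteness, the freeze/snapshot/thaw sequence described above could be sketched roughly as follows. This is only an illustration, not the actual backup script: it assumes the `virsh` and `rbd` command-line tools are what drive the steps, and the function names, the injectable `run` parameter, and the domain/image/snapshot names are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

# Type of the command runner; injectable so the sequence can be dry-run.
Runner = Callable[[Sequence[str]], None]


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(list(cmd), check=True)


def snapshot_with_freeze(domain: str, image_spec: str, snap_name: str,
                         run: Runner = run_checked) -> None:
    """Freeze guest filesystems, snapshot the RBD image, then thaw.

    domain, image_spec ("pool/image") and snap_name are placeholders
    chosen for this sketch.
    """
    try:
        run(["virsh", "domfsfreeze", domain])
        run(["rbd", "snap", "create", f"{image_spec}@{snap_name}"])
    finally:
        # Always attempt the thaw, even if the freeze or the snapshot
        # failed, so that a failed step is less likely to leave the
        # guest with frozen I/O.
        run(["virsh", "domfsthaw", domain])
```

Passing a recording function such as `issued.append` (with `issued = []`) as `run` gives a dry run that only collects the three commands. Note that the `finally` only guarantees a thaw is attempted; it does nothing about the case where the thaw itself times out or falsely reports success.
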
Are there any atomicity guarantees for RBD snapshots taken *without*
freezing the filesystem? Obviously the filesystem will be dirty and will
require journal recovery, but that is okay; it's equivalent to a hard
shutdown/crash. But is there any chance of corruption related to the
snapshot being taken in a non-atomic fashion? Filesystems and
applications these days should have no trouble with hard shutdowns, as
long as storage writes follow ordering guarantees (no writes getting
reordered across a barrier and such).

Put another way: do RBD snapshots have ~identical atomicity guarantees
to e.g. LVM snapshots?

If we can get away without the freeze, honestly I'd rather go that
route. If I really need to pause I/O during snapshot creation, I
might end up pausing the whole VM instead (suspend/resume), which
has a higher impact but probably a much lower chance of going wrong
(or of adding excess latency), since it doesn't involve the guest OS
or the QEMU guest agent at all.
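
The suspend/resume fallback mentioned above might look something like the sketch below. Again this is an assumption-laden illustration: it assumes `virsh suspend` / `virsh resume` and `rbd snap create` are the tools used, and the function names, the `run` parameter, and the example names are made up. It says nothing about whether the resulting snapshot is consistent, which is the open question here.

```python
import subprocess
from typing import Callable, Sequence

# Type of the command runner; injectable so the sequence can be dry-run.
Runner = Callable[[Sequence[str]], None]


def run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(list(cmd), check=True)


def snapshot_with_suspend(domain: str, image_spec: str, snap_name: str,
                          run: Runner = run_checked) -> None:
    """Pause the whole VM, snapshot the RBD image, then resume.

    Unlike the freeze-based variant, this never talks to the guest OS
    or the guest agent; only the hypervisor is involved.
    """
    # If the suspend itself fails, the exception propagates and no
    # snapshot or resume is attempted.
    run(["virsh", "suspend", domain])
    try:
        run(["rbd", "snap", "create", f"{image_spec}@{snap_name}"])
    finally:
        # Resume even if the snapshot failed, so the VM is not left paused.
        run(["virsh", "resume", domain])
```

As with any such wrapper, `run` can be swapped for a function that just records the commands in order to check the sequence without touching a real guest.
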
--
Hector Martin (hector@xxxxxxxxxxxxxx)
Public Key: https://marcan.st/marcan.asc
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com