kvm / virsh snapshot management

Gary Dale <gary@xxxxxxxxxxxxxxxxx> · Sat, 1 Jun 2019 20:12:01 -0400

A while back I converted a raw disk image to qcow2 so that I could use 
snapshots. However, I realize that I may not really understand exactly 
how snapshots work. In this case I'm only talking about internal 
snapshots, since there seem to be some differences of opinion as to 
whether internal or external snapshots are safer/more reliable. I'm 
also only talking about snapshots taken while the VM is shut down, so 
only the disk should be snapshotted.

As I understand it, the first snapshot freezes the base image and 
subsequent changes in the virtual machine's disk are stored elsewhere in 
the qcow2 file (remember, only internal snapshots). If I take a second 
snapshot, that freezes the first one, and subsequent changes are now in 
a third location. Each new snapshot is incremental to the one that 
preceded it rather than differential to the base image. Each new 
snapshot is a child of the previous one.
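
To make the "incremental, not differential" idea concrete, here is a 
minimal sketch in Python. It is only a toy model of the parent/child 
chain described above, assuming snapshots are taken one after another 
with no reverts in between. The names ToyDisk, Snapshot and delta are 
made up for illustration and have nothing to do with how qcow2 actually 
lays out its data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Snapshot:
    name: str
    parent: Optional[str]    # name of the snapshot taken just before this one
    blocks: Dict[int, str]   # the whole disk as it looked at snapshot time


@dataclass
class ToyDisk:
    """Hypothetical toy disk: block number -> contents."""
    active: Dict[int, str] = field(default_factory=dict)
    snapshots: List[Snapshot] = field(default_factory=list)

    def write(self, block: int, data: str) -> None:
        self.active[block] = data

    def snapshot(self, name: str) -> None:
        # Assumption: the new snapshot's parent is the most recent snapshot.
        parent = self.snapshots[-1].name if self.snapshots else None
        self.snapshots.append(Snapshot(name, parent, dict(self.active)))

    def delta(self, name: str) -> Dict[int, str]:
        """Blocks that differ between snapshot `name` and its parent."""
        by_name = {s.name: s for s in self.snapshots}
        snap = by_name[name]
        base = by_name[snap.parent].blocks if snap.parent else {}
        return {b: d for b, d in snap.blocks.items() if base.get(b) != d}


disk = ToyDisk()
disk.write(0, "base")
disk.snapshot("snap1")
disk.write(1, "after snap1")
disk.snapshot("snap2")
disk.write(2, "after snap2")

# snap2 only adds what changed since snap1, not everything since the start.
print(disk.delta("snap2"))  # {1: 'after snap1'}
```

In this model the write made after snap2 belongs to no snapshot at all; 
it exists only in the current disk state.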

One explanation I've seen is that if I delete a snapshot, the changes 
it contains are merged with its immediate child. So if I deleted 
the first snapshot, the base image stays the same but any data that has 
changed since the base image is now in the second snapshot's location. 
The merge with children explanation also implies that the base image is 
never touched even if the first snapshot is deleted.

But if I delete a snapshot that has no children, is that essentially the 
same as reverting to the point where that snapshot was created, so that 
all subsequent disk changes are lost? Or does it merge down to the 
parent snapshot? If I delete all snapshots, would that revert to the 
base image?

I've also seen it explained that a snapshot is very much like a 
timestamp. Deleting a timestamp only removes the dividing line between 
writes that occurred before and after that time. Data is really only 
removed if I revert to some timestamp: all writes after that point are 
discarded. In this explanation, deleting the oldest timestamp is 
essentially updating the base image. Deleting all snapshots would leave 
me with the base image fully updated.
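
The timestamp explanation can be sketched as a small Python toy model. 
This is a sketch under stated assumptions, not QEMU's code: it assumes 
a snapshot is just a saved copy of the block table, that data is kept 
alive by reference counts, and that writes are copy-on-write. ToyImage 
and all of its methods are hypothetical names invented for this 
example.

```python
class ToyImage:
    """Toy model: a snapshot is a saved block table; data is refcounted."""

    def __init__(self):
        self.store = {}      # cluster id -> data
        self.refs = {}       # cluster id -> number of tables pointing at it
        self.active = {}     # block number -> cluster id (current disk state)
        self.snapshots = {}  # snapshot name -> saved block table
        self._next = 0

    def _unref(self, cid):
        self.refs[cid] -= 1
        if self.refs[cid] == 0:   # no table uses this data any more
            del self.refs[cid], self.store[cid]

    def write(self, block, data):
        # Copy-on-write: new data always goes to a fresh cluster.
        cid = self._next
        self._next += 1
        self.store[cid] = data
        self.refs[cid] = 1
        old = self.active.get(block)
        self.active[block] = cid
        if old is not None:
            self._unref(old)

    def read(self):
        return {b: self.store[c] for b, c in self.active.items()}

    def snapshot(self, name):
        self.snapshots[name] = dict(self.active)
        for cid in self.active.values():
            self.refs[cid] += 1

    def delete(self, name):
        # Only the saved table goes away; the current state is untouched.
        for cid in self.snapshots.pop(name).values():
            self._unref(cid)

    def revert(self, name):
        # Writes made after the snapshot are discarded.
        table = dict(self.snapshots[name])
        for cid in table.values():
            self.refs[cid] += 1
        for cid in list(self.active.values()):
            self._unref(cid)
        self.active = table


img = ToyImage()
img.write(0, "base")
img.snapshot("snap1")
img.write(0, "changed")
img.write(1, "new")
img.snapshot("snap2")
img.write(1, "newer")

img.delete("snap1")
img.delete("snap2")
print(img.read())  # {0: 'changed', 1: 'newer'}
```

In this model, deleting every snapshot leaves the current disk contents 
as they are, and only revert() throws writes away. Whether real qcow2 
internal snapshots behave the same way is exactly the question being 
asked here.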

Frankly, the second explanation sounds more reasonable to me, without 
having to figure out how copy-on-write works. But I'm dealing with 
important data here and I don't want to mess it up by mishandling the 
snapshots.

Can someone provide a little clarity on this? Thanks!