On Wed, Aug 21, 2019 at 9:34 AM Florian Haas <florian@xxxxxxxxxxxxxx> wrote:
>
> Hi everyone,
>
> apologies in advance; this will be long. It's also been through a bunch
> of edits and rewrites, so I don't know how well I'm expressing myself at
> this stage; please holler if anything is unclear and I'll be happy to
> try to clarify.
>
> I am currently in the process of investigating the behavior of OpenStack
> Nova instances when being snapshotted and suspended, in conjunction with
> qemu-guest-agent (qemu-ga). I realize that RBD-backed Nova/libvirt
> instances are expected to behave differently from file-backed ones, but
> I think I might have reason to believe that the RBD-backed ones are
> indeed behaving incorrectly, and I'd like to verify that.
>
> So first up, for comparison, let's recap how a Nova/libvirt/KVM instance
> behaves when it is *not* backed by RBD (such as, it's using a qcow2 file
> that is on a Nova compute node in /var/lib/nova/instances), is booted
> from an image with the hw_qemu_guest_agent=yes meta property set, and
> runs qemu-guest-agent within the guest:
>
> - User issues "nova suspend" or "openstack server suspend".
>
> - If nova-compute on the compute node decides that the instance has
> qemu-guest-agent running (which is the case if it's qemu or kvm, and its
> image has hw_qemu_guest_agent=yes), it sends a guest-sync command over
> the guest agent VirtIO serial port. This command registers in the
> qemu-ga log file in the guest.
>
> - nova-compute on the compute node sends a libvirt managed-save command.
>
> - Nova reports the instance as suspended.
>
> - User issues "nova resume" or "openstack server resume".
>
> - nova-compute on the compute node sends a libvirt start command.
>
> - Again, if nova-compute on the compute node knows that the instance has
> qemu-guest-agent running, it sends another command over the serial port,
> namely guest-set-time. This, too, registers in the guest's qemu-ga log.
>
> - Nova reports the instance as active (running normally) again.
>
>
> Now, when I instead use a Nova environment that is fully RBD-backed, I
> see exactly the same behavior as described above. So I know that in
> principle, nova-compute/qemu-ga communication works in both an
> RBD-backed and a non-RBD-backed environment.
>
>
> However, things appear to get very different when it comes to snapshots.
>
>
> Again, starting with a file-backed environment:
>
> - User issues "nova image-create" or "openstack server image create".
>
> - If nova-compute on the compute node decides that the instance can be
> quiesced (which is the case if it's qemu or kvm, and its image has
> hw_qemu_guest_agent=yes), then it sends a "guest-fsfreeze-freeze"
> command over the guest agent VirtIO serial port.
>
> - The guest agent inside the guest loops over all mounted filesystems,
> and issues the FIFREEZE ioctl (which maps to the kernel freeze_super()
> function). This can be seen in the qemu-ga log file in the guest, and it
> is also verifiable by using ftrace on the qemu-ga PID and checking for
> the freeze_super() function call.
>
> - nova-compute then takes a live snapshot of the instance.
>
> - Once complete, the guest gets a "guest-fsfreeze-thaw" command, and
> again I can see this in the qemu-ga log, and with ftrace.
>
>
> And now with RBD:
>
> - User issues "nova image-create" or "openstack server image create".
>
> - The guest-fsfreeze-freeze agent command never happens.
>
> Now I can see the info message from
> https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/driver.py#L2048
> in my nova-compute log, which confirms that we're attempting a live
> snapshot.
>
> I also do *not* see the warning from
> https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/driver.py#L2068,
> so it looks like the direct_snapshot() call from
> https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/driver.py#L2058
> succeeds. This is defined in
> https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/imagebackend.py#L1055
> and it uses RBD functionality only. Importantly, it never interacts with
> qemu-ga, so it appears to not worry at all about freezing the filesystem.
>
> (Which does seem to contradict
> https://docs.ceph.com/docs/master/rbd/rbd-openstack/?highlight=uuid#image-properties,
> by the way, so that may be a documentation bug.)
>
> Now here's another interesting part. Were the direct snapshot to fail,
> if I read
> https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/driver.py#L2081
> and
> https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/driver.py#L2144
> correctly, the fallback behavior would be as follows: The domain would
> next be "suspended" (note, again this is Nova suspend, which maps to
> libvirt managed-save per
> https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac92bfe75dc59/nova/virt/libvirt/guest.py#L504),
> then snapshotted using a libvirt call and resumed again post-snapshot.
> In which case there would be a guest-sync call on suspend.
>
> And it's this part that has me a bit worried. If an RBD backed instance,
> on a successful snapshot, never freezes its filesystem *and* never does
> any kind of sync, either, doesn't that mean that such an instance can't
> be made to produce consistent snapshots? (Particularly in the case of
> write-back caching, which is recommended and normally safe for
> RBD/virtio devices.)
> Or is there some magic within the Qemu RBD storage
> driver that I am unaware of, that makes any such contortions unnecessary?

It just looks like this was an oversight from the OpenStack developers
when Nova RBD "direct" ephemeral image snapshot support was added [1].
I would open a bug ticket against Nova for the issue.

> Thanks in advance for your insights!
>
> Cheers,
> Florian
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

[1] https://opendev.org/openstack/nova/commit/824c3706a3ea691781f4fcc4453881517a9e1c55

--
Jason
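
For illustration only, the freeze / snapshot / thaw ordering that the thread describes for the file-backed case can be sketched in a few lines of Python. This is not Nova's code. The names `quiesced` and `snapshot_with_quiesce` and the labels "filesystem-consistent" and "crash-consistent" are hypothetical, and the freeze, thaw and snapshot steps are plain callables standing in for the guest agent commands and the RBD snapshot call.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def quiesced(freeze: Callable[[], None], thaw: Callable[[], None]) -> Iterator[None]:
    """Freeze guest filesystems for the duration of the block, then thaw."""
    freeze()
    try:
        yield
    finally:
        # Thaw even if the snapshot raised, so the guest is never left frozen.
        thaw()


def snapshot_with_quiesce(
    freeze: Callable[[], None],
    thaw: Callable[[], None],
    take_snapshot: Callable[[], None],
    agent_available: bool,
) -> str:
    """Take a snapshot, quiescing the guest first when an agent is available.

    The returned label is a hypothetical way of recording which kind of
    snapshot was produced.
    """
    if not agent_available:
        # No guest agent: the snapshot is taken without any freeze.
        take_snapshot()
        return "crash-consistent"
    with quiesced(freeze, thaw):
        take_snapshot()
    return "filesystem-consistent"


# Stand-in callables that only record the order of operations.
calls: List[str] = []
result = snapshot_with_quiesce(
    freeze=lambda: calls.append("guest-fsfreeze-freeze"),
    thaw=lambda: calls.append("guest-fsfreeze-thaw"),
    take_snapshot=lambda: calls.append("rbd snapshot"),
    agent_available=True,
)
print(calls, result)
```

In a real driver the three callables would wrap the guest agent and storage calls. The `finally` clause is the part worth keeping in any variant: a guest left frozen after a failed snapshot cannot complete writes until it is thawed.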