Re: Live migrate RBD image with a client using it

For KVM virtual machines, one of my coworkers worked out a way to live migrate a VM, along with its storage, to another node in the same cluster while moving the storage to a different pool. This requires that the VM can be live migrated to a new host with access to the same Ceph cluster, and that there is enough bandwidth between the two hosts (or the VM is idle enough) for the RAM and disk replication to keep up with the running server. This probably originated from a webpage somewhere, but I haven't been able to find it again.

The steps were:

1) Create an empty RBD in the new pool that is the same size as the RBD in the existing pool

qemu-img info rbd:sourcepool/rbdname
qemu-img create -f rbd rbd:targetpool/rbdname <size>G
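
If you'd rather stay inside the rbd CLI, something like this should work as well (a rough sketch; reuse the size reported by rbd info):

rbd info sourcepool/rbdname                    # note the reported size
rbd create --size <size>G targetpool/rbdname   # create an empty image of the same size in the target pool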

2) Dump a copy of the libvirt XML file for the virtual machine

virsh dumpxml vmname > vmname-migrate.xml

3) Edit the dumped XML file to update the pool name

<source protocol='rbd' name='targetpool/rbdname'>
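
If the pool name only appears in the disk source line, the edit can also be done with a quick one-liner instead of a hand edit (a sketch; double-check the resulting file either way):

sed -i "s|name='sourcepool/|name='targetpool/|" vmname-migrate.xml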

4) Migrate the virtual machine to another host in the same cluster

virsh migrate --live --persistent --copy-storage-all --verbose --xml vmname-migrate.xml vmname qemu+ssh://target_host/system
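
While the migration runs, you can keep an eye on progress from the source host with something like:

virsh domjobinfo vmname    # shows memory and disk data remaining for the in-progress migration job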

5) After the migration is complete and you've confirmed the VM is running on the target host, delete or rename the RBD image in the old pool to make sure the VM cannot accidentally boot from it again later

rbd mv sourcepool/rbdname sourcepool/rbdname-old-migrated
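
Before renaming or deleting, it's also worth confirming nothing is still attached to the old image; a quick sanity check:

rbd status sourcepool/rbdname    # should report no watchers once the VM is running from the new image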

6) Check the VM definition on the target node to make sure the pool was updated. I think we had cases where the VM was running fine after the migration, but its configuration still referred to the old pool, so when the VM was rebooted it tried to find the image in the old pool again.
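
A quick way to check, and to fix the definition if it still points at the old pool (the grep pattern is just for illustration):

virsh dumpxml vmname | grep rbdname    # run on the target host; should show targetpool/rbdname
virsh define vmname-migrate.xml        # if it still shows the old pool, redefine from the edited XML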

7) A good test to run after the migration is complete might be to create a snapshot of the RBD image in the target pool, create a clone of the VM that boots from that snapshot (with the clone's network disconnected or removed to avoid an IP conflict), and make sure it boots successfully. I don't know if this was related, but around this time we had two VMs that lost their boot sector or partition table or something like that. One of them I corrected by running testdisk; the other a coworker was able to quickly rebuild from a backup.
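
On the RBD side, the snapshot and clone for such a test would look something like this (the snapshot and clone names are just placeholders):

rbd snap create targetpool/rbdname@post-migrate
rbd snap protect targetpool/rbdname@post-migrate
rbd clone targetpool/rbdname@post-migrate targetpool/rbdname-testclone

Attach the clone to a throwaway VM definition with the NIC removed, try to boot it, then remove the clone and snapshot when done.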

This same general approach should work regardless of whether the source and target images in the XML use protocol='rbd', an rbd map device, an iSCSI mount, or even a different storage/replication method entirely, such as a qcow image, an LVM volume, or DRBD. Each of these involves different steps to find the source image size and create the destination image, and different changes to the disk section of vmname-migrate.xml. We have used this approach to migrate a VM between two Ceph clusters, for example, which also involved changing the monitor IP entries in the XML.
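
For non-RBD sources, only the size lookup in step 1 really changes; a couple of sketches, with the paths as placeholders:

blockdev --getsize64 /dev/vg0/vmname-disk              # LVM/DRBD-style block device, size in bytes
qemu-img info /var/lib/libvirt/images/vmname.qcow2     # qcow image, use the reported virtual size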


On 4/12/23 05:12, Work Ceph wrote:
Hello guys,

We have been reading the docs [1], and trying to reproduce the live-migration process in our
Ceph cluster. However, we always receive the following message:


```
librbd::Migration: prepare: image has watchers - not migrating
rbd: preparing migration failed: (16) Device or resource busy
```


We tested both with RBD block devices mapped with krbd and with librbd via the
KVM/QEMU system, and both cases produce the same result. For krbd we
understood that this is not supported right now, but for librbd it seems that
it should be supported somehow.


How do you guys handle those situations?


We have the following use cases that might require migrating an image between
pools while the client is still consuming it:

    - RBD images that are consumed via the iSCSI gateways
    - RBD images mounted (rbd map) in hosts
    - RBD images used by KVM/Libvirt


Does Ceph support a live migration of images between pools while the
clients/consumers are still using those volumes?




[1] https://docs.ceph.com/en/quincy/rbd/rbd-live-migration/

--
Nelson Hicks
Information Technology
SOCKET
(573) 817-0000 ext. 210
nelsonh@xxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


