On 11.12.2018 12:59, Kevin Olbrich wrote:
Hi! I am currently planning the migration of a large VM (MS Exchange, 300 mailboxes and a 900 GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph Luminous cluster (which already holds lots of images). The server has access to both local and cluster storage; I only need to live-migrate the storage, not the machine.

I have never used live migration, as it can cause more issues, and the VMs that are already migrated had planned downtime. Taking the VM offline and converting/importing it with qemu-img would take some hours, but I would like to keep serving clients, even if it is slower.

The VM is I/O-heavy in terms of the old storage (LSI/Adaptec with BBU). There are two HDDs bound as RAID1 which are constantly under 30% - 60% load (this goes up to 100% during reboots, updates or login prime time).

What happens when either the local compute node or the Ceph cluster fails (degraded)? Or when the network is unavailable? Are all writes performed to both locations? Is this fail-safe? Or does the VM crash in the worst case, which can lead to a dirty shutdown for the MS-EX DBs?
The disk stays on the source location until the migration is finalized.

If the local compute node crashes and the VM dies with it before the migration is done, the disk is still on the source location, as expected.

If nodes in the Ceph cluster die but the cluster remains operational, Ceph simply self-heals and the migration finishes.

If the cluster dies hard enough to actually break, the migration will time out and abort, and the disk remains on the source location.

If the network is unavailable, the transfer will also time out.
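To make the cases above easier to compare, here is a minimal Python sketch that only tabulates the outcomes described in this reply. It is not a libvirt, QEMU or Ceph API; the names `Failure`, `Outcome` and `migration_outcome` are hypothetical and exist purely for illustration. The "source" result for the network case is an assumption that follows from the disk staying on the source until the migration is finalized.

```python
from enum import Enum, auto
from typing import NamedTuple


class Failure(Enum):
    """Failure scenarios discussed in this thread (hypothetical names)."""
    NONE = auto()
    COMPUTE_NODE_CRASH = auto()
    CEPH_NODES_LOST_CLUSTER_UP = auto()
    CEPH_CLUSTER_BROKEN = auto()
    NETWORK_UNAVAILABLE = auto()


class Outcome(NamedTuple):
    migration: str    # what happens to the migration job
    active_disk: str  # where the authoritative disk image ends up


def migration_outcome(failure: Failure) -> Outcome:
    """Return the expected result of a live storage migration."""
    # Ceph self-heals when only some nodes are lost, so the job still finishes.
    if failure in (Failure.NONE, Failure.CEPH_NODES_LOST_CLUSTER_UP):
        return Outcome("completed", "ceph")
    # The VM dies with the compute node; nothing was finalized yet.
    if failure is Failure.COMPUTE_NODE_CRASH:
        return Outcome("interrupted", "source")
    # Broken cluster or unreachable network: the transfer times out.
    # Assumption: the disk stays on the source, since it was never finalized.
    return Outcome("timed out", "source")


if __name__ == "__main__":
    for failure in Failure:
        result = migration_outcome(failure)
        print(f"{failure.name}: migration {result.migration}, "
              f"disk on {result.active_disk}")
```

In every failure case except a partial, self-healing loss of Ceph nodes, the authoritative copy of the disk remains on the source storage.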
Good luck,
Ronny Aasen