Hi everybody,

We need to do VM failover using disk images on RBD in order to avoid data loss, and we want to limit the downtime as much as possible.

We have:
- Two hypervisors, each with a Ceph Monitor and a Ceph OSD.
- A third machine with a Ceph Monitor and a Ceph Manager.

The VMs run under qemu. The VM disks are on a "replicated" RBD pool backed by the two OSDs.
Ceph version: Nautilus
Distribution: Yocto Zeus

We perform the following test: we electrically power off one hypervisor (and therefore one Ceph Monitor and one Ceph OSD), which causes its VMs to be restarted on the second hypervisor.

My main issue is that mounting a partition read-write is very slow in the failover case (i.e. after the loss of an OSD and its Monitor).

After a failover we can write to the device after ~25 s:

[   25.609074] EXT4-fs (vda3): mounted filesystem with ordered data mode. Opts: (null)

On a normal boot we can write to the device after ~4 s:

[    3.087412] EXT4-fs (vda3): mounted filesystem with ordered data mode. Opts: (null)

I wasn't able to reduce this time by tweaking Ceph settings, and I am wondering if someone could help me with that. Here is our configuration; a few command sketches follow it for reference.

ceph.conf:

[global]
fsid = fa7a17d1-5351-459e-bf0e-07e7edc9a625
mon initial members = hypervisor1,hypervisor2,observer
mon host = 192.168.217.131,192.168.217.132,192.168.217.133
public network = 192.168.217.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 1024
osd pool default size = 2
osd pool default min size = 1
osd crush chooseleaf type = 1
mon osd adjust heartbeat grace = false
mon osd min down reporters = 1

[mon.hypervisor1]
host = hypervisor1
mon addr = 192.168.217.131:6789

[mon.hypervisor2]
host = hypervisor2
mon addr = 192.168.217.132:6789

[mon.observer]
host = observer
mon addr = 192.168.217.133:6789

[osd.0]
host = hypervisor1
public_addr = 192.168.217.131
cluster_addr = 192.168.217.131

[osd.1]
host = hypervisor2
public_addr = 192.168.217.132
cluster_addr = 192.168.217.132

# ceph config dump
WHO     MASK  LEVEL     OPTION                            VALUE     RO
global        advanced  mon_osd_adjust_down_out_interval  false
global        advanced  mon_osd_adjust_heartbeat_grace    false
global        advanced  mon_osd_down_out_interval         5
global        advanced  mon_osd_report_timeout            4
global        advanced  osd_beacon_report_interval        1
global        advanced  osd_heartbeat_grace               2
global        advanced  osd_heartbeat_interval            1
global        advanced  osd_mon_ack_timeout               1.000000
global        advanced  osd_mon_heartbeat_interval        2
global        advanced  osd_mon_report_interval           3
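
For context, here is a minimal sketch of how a pool and image like ours can be created. The pool name "vms", the PG count and the image name/size are illustrative, not our actual values; size=2 and min_size=1 just match the pool defaults in our ceph.conf:

  # Replicated pool for VM disks (size/min_size follow our ceph.conf defaults)
  ceph osd pool create vms 64 64 replicated
  ceph osd pool set vms size 2
  ceph osd pool set vms min_size 1
  rbd pool init vms
  # One RBD image per VM disk (name and size are examples)
  rbd create vms/vm1-disk --size 20G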
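
To see where the ~25 s goes during the test, we watch the cluster while the hypervisor is powered off and correlate with the guest's dmesg. These are standard Ceph/Linux commands, nothing specific to our setup:

  # Timestamped cluster events (mon election, osd.X marked down, peering)
  ceph -w
  # Or poll the overall state and the OSD tree once per second
  watch -n 1 'ceph -s; ceph osd tree'
  # Inside the recovered VM, the mount timestamp for comparison
  dmesg | grep EXT4-fs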
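
For completeness: on Nautilus these timing options can also be changed at runtime with "ceph config set" instead of editing ceph.conf. A sketch with three of the values from the dump above:

  ceph config set global osd_heartbeat_interval 1
  ceph config set global osd_heartbeat_grace 2
  ceph config set global mon_osd_report_timeout 4

Thanks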