Write access delay after OSD & Mon lost

Hi everybody,
Our goal is to do VM failover using RBD disk images, so that we avoid data loss while keeping downtime as short as possible.

We have:
- Two hypervisors, each running a Ceph Monitor and a Ceph OSD.
- A third machine running a Ceph Monitor and a Ceph Manager.

The VMs run under QEMU. The VM disks are on a "replicated" RBD pool formed by the two OSDs.
Ceph version: Nautilus
Distribution: Yocto Zeus
The following test is performed: we electrically power off one hypervisor (and therefore one Ceph Monitor and one Ceph OSD), which causes its VMs to be restarted on the second hypervisor.
My main issue is that mounting a partition read-write is very slow after such a failover (i.e. after the loss of an OSD and its Monitor).
After a failover, we can write to the device after ~25 s:

[   25.609074] EXT4-fs (vda3): mounted filesystem with ordered data mode. Opts: (null)

On a normal boot, we can write to the device after ~4 s:

[    3.087412] EXT4-fs (vda3): mounted filesystem with ordered data mode. Opts: (null)
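To sanity-check the delay, here is a rough back-of-the-envelope sum of our detection-related timeouts, using the values from our configuration (a sketch of how I understand the failure-detection path, not Ceph's exact state machine):

```python
# Back-of-the-envelope model of OSD failure detection time, using the
# values from our configuration (a sketch, not Ceph's actual algorithm).

osd_heartbeat_interval = 1     # peers ping each other every second
osd_heartbeat_grace = 2        # peer considered dead after 2 s of silence
mon_osd_down_out_interval = 5  # mon marks a down OSD "out" after 5 s

# A surviving OSD should notice the failure after roughly:
peer_detection = osd_heartbeat_interval + osd_heartbeat_grace  # ~3 s

# The mon then marks the OSD down (one reporter suffices with
# mon_osd_min_down_reporters = 1), and "out" after the down-out interval:
worst_case_out = peer_detection + mon_osd_down_out_interval    # ~8 s

print(f"expected detection: ~{peer_detection} s, marked out: ~{worst_case_out} s")
```

By this rough model the cluster should react within a few seconds, so most of the ~25 s seems to be spent elsewhere (perhaps monitor re-election after losing a mon, or osdmap propagation to the client).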
I wasn't able to reduce this time by tweaking Ceph settings, and I am wondering if someone could help me with this.
Here is our configuration.
ceph.conf:

[global]
    fsid = fa7a17d1-5351-459e-bf0e-07e7edc9a625
    mon initial members = hypervisor1,hypervisor2,observer
    mon host = 192.168.217.131,192.168.217.132,192.168.217.133
    public network = 192.168.217.0/24
    auth cluster required = cephx
    auth service required = cephx
    auth client required = cephx
    osd journal size = 1024
    osd pool default size = 2
    osd pool default min size = 1
    osd crush chooseleaf type = 1
    mon osd adjust heartbeat grace = false
    mon osd min down reporters = 1

[mon.hypervisor1]
    host = hypervisor1
    mon addr = 192.168.217.131:6789

[mon.hypervisor2]
    host = hypervisor2
    mon addr = 192.168.217.132:6789

[mon.observer]
    host = observer
    mon addr = 192.168.217.133:6789

[osd.0]
    host = hypervisor1
    public_addr = 192.168.217.131
    cluster_addr = 192.168.217.131

[osd.1]
    host = hypervisor2
    public_addr = 192.168.217.132
    cluster_addr = 192.168.217.13
# ceph config dump
WHO    MASK LEVEL    OPTION                           VALUE    RO
global      advanced mon_osd_adjust_down_out_interval false
global      advanced mon_osd_adjust_heartbeat_grace   false
global      advanced mon_osd_down_out_interval        5
global      advanced mon_osd_report_timeout           4
global      advanced osd_beacon_report_interval       1
global      advanced osd_heartbeat_grace              2
global      advanced osd_heartbeat_interval           1
global      advanced osd_mon_ack_timeout              1.000000
global      advanced osd_mon_heartbeat_interval       2
global      advanced osd_mon_report_interval          3
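For reference, runtime options like these can be changed with the Nautilus `ceph config` CLI; a sketch of how we applied them (values mirror the dump above, the `global` target is how we set them):

```shell
# Apply the tuned detection timeouts cluster-wide (requires admin keyring).
ceph config set global osd_heartbeat_interval 1
ceph config set global osd_heartbeat_grace 2
ceph config set global mon_osd_down_out_interval 5

# Verify what a daemon actually resolves the option to:
ceph config get osd.0 osd_heartbeat_grace
```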
Thanks
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
