How does RADOS get to a data block if the primary OSD is out?

I have been reading the architecture section of the Ceph documentation. One thing that is not clear to me is how data HA works when an OSD or server fails. Does the CRUSH algorithm recalculate placement based on the new cluster map and direct reads and writes of existing data blocks to the 2nd or 3rd replica? Given that the locations (OSDs) of the 2nd and 3rd replicas were calculated by the primary OSD rather than by the client, it is not clear to me whether and how this is done.
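
To make the question concrete, here is a toy sketch of the behavior I am asking about. It is an assumption for illustration only, not Ceph's implementation: it uses rendezvous hashing as a stand-in for CRUSH, and `placement`, `cluster_map`, and the object name are made-up names. It only shows that any party holding the same map computes the same ordered replica list, and that dropping the primary from the map promotes the next replica.

```python
import hashlib


def placement(object_name, up_osds, replicas=3):
    """Return an ordered list of OSD ids for object_name (first = primary).

    Toy stand-in using rendezvous hashing, NOT the real CRUSH algorithm.
    """
    def score(osd):
        # Hash the (object, OSD) pair; the highest scores win the replicas.
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(up_osds, key=score, reverse=True)[:replicas]


# Hypothetical cluster map: ids of the OSDs currently marked "up".
cluster_map = {0, 1, 2, 3, 4, 5}
before = placement("vm-disk-block-42", cluster_map)

# Simulate the primary failing: a new map that no longer contains it.
after = placement("vm-disk-block-42", cluster_map - {before[0]})

print("before:", before)
print("after: ", after)
```

In this toy model the old 2nd replica becomes the new primary as soon as the client sees the updated map, with no help from the failed OSD. Whether real CRUSH and librados behave the same way is exactly what I would like to confirm.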

A related question about the data HA mechanism: if the client (librados) does recalculate the primary OSD location and point to the 2nd OSD, how much latency will the client (e.g., a VM) experience, or how long will its I/O hang, under an average load? In our traditional commercial hypervisor environment, we have experienced SCSI timeouts, with Linux guest file systems turning read-only, due to NFS datastore or network hiccups.

Thanks. --weiguo

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
