As far as I have read, as soon as an OSD is marked down, writes won't proceed because the affected PGs have to peer and the objects have to be recovered before they can be written. We got kernel hung task timeouts on a bunch of VMs when a Ceph node was taken down.
I'm not sure what you mean by transparent. Does the I/O hang forever when a node goes down? If an OSD is taken down gracefully, there should be minimal disruption to traffic. If you yank the network or power cables, it can take around 30 seconds before the cluster notices the failure and marks the OSD down.
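For reference, that detection window is controlled by the heartbeat and monitor settings. A minimal ceph.conf sketch is below; the values shown are rough defaults from memory, so treat them as assumptions and check the documentation for your release:

[osd]
# how often an OSD pings its peers, in seconds (assumed default ~6)
osd heartbeat interval = 6
# how long a peer can miss heartbeats before it is reported down (assumed default ~20)
osd heartbeat grace = 20

[mon]
# how many OSDs must report a peer down before the monitors mark it down
mon osd min down reporters = 2
# how long a down OSD stays "in" before it is marked out and re-replication starts, in seconds
mon osd down out interval = 300

Lowering the heartbeat settings shortens the window where client I/O blocks, at the cost of more false positives. For planned maintenance you can also set the noout flag first (ceph osd set noout, then ceph osd unset noout afterwards) so PGs aren't remapped while the node is briefly down.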
Robert LeBlanc
Sent from a mobile device please excuse any typos.
On Jan 3, 2016 9:04 PM, "Kalyana sundaram" <kalyanceg@xxxxxxxxx> wrote:
Hi
We are building our private cloud and have decided to use Ceph to provide EBS-like features. How can we make it transparent to the VMs when one Ceph node goes down? When a node goes down we lose a set of OSDs, and therefore a set of PGs has to be recovered. Clients' reads and writes might fail until the CRUSH map is updated and peering/recovery agrees on the state of the objects. This hangs reads and writes on the VMs. How can we make this transparent to the VMs?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com