I could see that happening if you have size and min_size equal. Can you provide some details about your setup? The peering should be pretty fast, and if min_size < size then writes can happen without recovery.
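For example (the pool name "rbd" here is just an example), you can check and adjust those values with:

    ceph osd pool get rbd size
    ceph osd pool get rbd min_size
    ceph osd pool set rbd min_size 2

With size=3 and min_size=2, a PG can keep accepting writes while one replica is down; with min_size equal to size, I/O on that PG blocks until all copies are available again.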
Also, if you are using KVM, I suggest using librbd instead of KRBD. If something funky happens with the VM or RBD, it is less likely to lock up the whole box.
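As a rough sketch (the image name "rbd/vm-disk" is only an example), the difference is whether QEMU opens the image itself via librbd or the host maps it through the kernel client first:

    # librbd: QEMU talks to the cluster from userspace
    qemu-system-x86_64 ... -drive format=raw,file=rbd:rbd/vm-disk:conf=/etc/ceph/ceph.conf

    # krbd: the image is mapped as a block device on the host
    rbd map rbd/vm-disk
    qemu-system-x86_64 ... -drive format=raw,file=/dev/rbd/rbd/vm-disk

If librbd gets stuck, it is that one QEMU process that suffers; a stuck krbd mapping can wedge the host kernel.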
Robert LeBlanc
Sent from a mobile device, please excuse any typos.
As far as I've read, as soon as an OSD is marked down, writes won't complete because the PGs have to be peered and the object has to be recovered before it can be written. We got kernel hung task timeouts on a bunch of VMs when a Ceph node was taken down.
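(For reference, those messages come from the kernel's hung task watchdog inside the guest; the threshold it uses is visible with:

    cat /proc/sys/kernel/hung_task_timeout_secs

which defaults to 120 seconds, so the I/O was blocked for at least that long.)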
On Jan 4, 2016 11:04 AM, "Robert LeBlanc" <robert@xxxxxxxxxxxxx> wrote:
I'm not sure what you mean by transparent. Does the IO hang forever when a node goes down? If an OSD is taken down gracefully, there should be minimal disruption of traffic. If you yank the network or power cables, it can take around 30 seconds before the cluster considers the OSD down and marks it bad.
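A sketch of what "gracefully" means here (the OSD id is just an example, and the exact service command depends on your init system):

    ceph osd set noout                  # don't rebalance while the node is down for maintenance
    systemctl stop ceph-osd@3           # a clean shutdown lets the OSD report itself down immediately
    # ...do the maintenance / reboot...
    systemctl start ceph-osd@3
    ceph osd unset noout

When an OSD dies uncleanly instead, the monitors only mark it down after its peers stop hearing heartbeats for the grace period (osd heartbeat grace, 20 seconds by default), which is roughly where that 30 second window comes from.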
Robert LeBlanc
Sent from a mobile device, please excuse any typos.
On Jan 3, 2016 9:04 PM, "Kalyana sundaram" <kalyanceg@xxxxxxxxx> wrote:
Hi,
We are building our private cloud and have decided to use Ceph to provide EBS-like features. How can we make it transparent to the VMs when one Ceph node goes down? When a node goes down we lose a set of OSDs, and therefore a set of PGs has to be recovered. Clients' reads and writes might fail until the CRUSH map is updated and peering/recovery agrees on the state of the objects. This hangs reads and writes on the VMs. How can we make this transparent to the VMs?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com