That's expected from Ceph by design. But in our case we are following all the
recommendations - rack failure domain, separate replication network, etc. - and
we still see client IO performance issues while a single OSD is down.

On Tue, Feb 19, 2019 at 10:56 PM David Turner <drakonstein@xxxxxxxxx> wrote:
>
> With a RACK failure domain, you should be able to have an entire rack powered
> down without noticing any major impact on the clients. I regularly take down
> OSDs and nodes for maintenance and upgrades without seeing any problems with
> client IO.
>
> On Tue, Feb 12, 2019 at 5:01 AM M Ranga Swami Reddy <swamireddy@xxxxxxxxx> wrote:
>>
>> Hello - I have a couple of questions on Ceph cluster stability, even though
>> we follow all the recommendations below:
>> - a separate replication network and data network
>> - RACK as the failure domain
>> - SSDs for journals (1:4 ratio)
>>
>> Q1 - If one OSD goes down, cluster IO drops drastically and customer apps
>> are impacted.
>> Q2 - What is the stability ratio? In other words, with the above setup, is
>> the cluster still in a workable state when one OSD or one node is down?
>>
>> Thanks
>> Swami
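
One mitigation that is commonly suggested when a single failed OSD hurts client
IO is throttling recovery/backfill so it does not starve client traffic. A
rough sketch of the relevant knobs (the values below are illustrative
assumptions, not recommendations verified on this cluster):

    # Limit per-OSD recovery concurrency at runtime so client IO keeps headroom
    ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'

    # Add a small pause between recovery ops (seconds), trading recovery speed
    # for client latency
    ceph tell osd.* injectargs '--osd_recovery_sleep 0.1'

    # For planned maintenance, prevent data movement while the OSD is briefly down
    ceph osd set noout
    # ... restart/replace the OSD ...
    ceph osd unset noout

With noout set, a short OSD outage only leaves the affected PGs degraded
instead of triggering a full backfill, which is usually what keeps client IO
stable during node maintenance.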