risk mitigation in 2 replica clusters

Hi all,

I'm doing some work to evaluate the risks involved in running 2r storage pools. On the face of it, my naive disk-failure calculations give me 4-5 nines for a 2r pool of 100 OSDs (no copyset awareness, i.e., a second failure is counted purely on the chance that any one of the remaining 99 OSDs fails within the recovery window). Five nines is just fine for our purposes, but of course multiple disk failures are only part of the story.
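
For reference, the snippet below is the kind of back-of-the-envelope model I mean. The AFR and recovery-time figures are illustrative placeholders rather than measurements from our hardware, and it only estimates the probability of at least one double failure per year; counting the expected fraction of objects actually lost in such an event would give more nines still.

    import math

    HOURS_PER_YEAR = 24 * 365

    n_osds = 100          # OSDs in the 2r pool
    afr = 0.03            # assumed annualised disk failure rate (illustrative)
    recovery_hours = 4.0  # assumed time to re-replicate a failed OSD (illustrative)

    # Expected number of "first" disk failures per year across the pool.
    first_failures_per_year = n_osds * afr

    # Chance that at least one of the remaining 99 OSDs also fails within
    # the recovery window (no copyset awareness, pure chance).
    p_peer_fails_in_window = afr * recovery_hours / HOURS_PER_YEAR
    p_second_failure = 1 - (1 - p_peer_fails_in_window) ** (n_osds - 1)

    # Probability of at least one double failure (i.e. some data loss) per year.
    p_loss_per_year = 1 - (1 - p_second_failure) ** first_failures_per_year

    print(f"annual double-failure probability: {p_loss_per_year:.2e} "
          f"(~{-math.log10(p_loss_per_year):.1f} nines)")
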

The more problematic issue with 2r clusters is that any time you do planned maintenance (our clusters spend much more time degraded because of regular upkeep than because of real failures), you suddenly and drastically increase the risk of data loss. So I find myself wondering whether there is a way to tell Ceph to create an extra replica for a particular PG or set of PGs, i.e., the functional equivalent of: "this OSD/node is going offline, so please create a third replica in every PG it participates in before we shut it down"...?
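
The only knob I can see today is pool-wide rather than per-PG: temporarily bump the pool's size to 3 before the maintenance window and drop it back afterwards, roughly along the lines of the sketch below (the pool name and the HEALTH_OK polling are purely illustrative, not a recommendation). The obvious downside is that it re-replicates the whole pool rather than just the PGs on the node going down, which is exactly why something per-OSD/per-PG would be nicer.

    import subprocess
    import time

    POOL = "volumes"   # hypothetical pool name, substitute your own

    def ceph(*args):
        """Run a ceph CLI command and return its stdout."""
        return subprocess.run(["ceph", *args], check=True,
                              capture_output=True, text=True).stdout

    # 1. Add a third replica pool-wide and wait for the cluster to settle.
    #    (In practice you would probably check recovery state rather than
    #    overall health, which can be noisy for unrelated reasons.)
    ceph("osd", "pool", "set", POOL, "size", "3")
    while "HEALTH_OK" not in ceph("health"):
        time.sleep(30)

    # 2. ...take the OSD/node down and do the maintenance here...

    # 3. Drop back to two replicas once the node is back and healthy.
    ceph("osd", "pool", "set", POOL, "size", "2")
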

--
Cheers,
~Blairo
