I'd like to extend this discussion a little bit.

We were told to run 1 OSD per physical disk and, to make sure we never lose any data, to keep 3 replicas. So raw disk capacity has to be divided by 3 to get usable capacity. Name-brand vendors tell us they get 80% usable capacity out of the installed raw, that it's just as safe thanks to double parity, and yes, they do the fancy stuff like dedupe and compression to achieve this.

Anyhow, my boss is calculating:

200isk per TB raw in an n+2 Ceph cluster = 600isk per TB usable
vendor X at 600isk per TB raw, plus lots of fancy stuff = 750isk per TB usable

Besides some good arguments for both solutions, we are also wondering whether we could lower the cost by adding some RAID below the OSDs and doing only n+1 replication. In a 12-disk node we would lose 2 of the 12 disks (about 17% of raw) to RAID6, so the calculation would look something like this:

200isk per TB raw * (12/10) * 2 replicas = 480isk per TB usable

Is this thinking heading in the wrong direction?

regards
Andi

________________________________________
From: ceph-users-bounces@xxxxxxxxxxxxxx [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Oliver Daudey [oliver@xxxxxxxxx]
Sent: Sunday, 1 September 2013 01:54
To: Dimitri Maziuk
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: some newbie questions...

On Sat, 2013-08-31 at 13:34 -0500, Dimitri Maziuk wrote:
> On 2013-08-31 11:36, Dzianis Kahanovich wrote:
> > Johannes Klarenbeek wrote:
> >
> >>>
> >>> 1) I read somewhere that it is recommended to have one OSD per disk in a production environment.
> >>> Is this also the maximum number of disks per OSD, or could I use multiple disks per OSD? And why?
> >>
> >> You could use multiple disks for one OSD if you striped them and abstracted the disk (with LVM, MD RAID, etc.), but it wouldn't make much sense. One OSD writes into one filesystem, which is usually one disk in a production environment. Using RAID under it wouldn't drastically increase either reliability or performance.
> >
> > I see some sense in RAID 0: a single ceph-osd daemon per node (though still one disk per OSD otherwise). But if you have relatively few [planned] cores per task on a node, you can consider it.
>
> RAID-0: a single disk failure kills the entire filesystem, off-lines the
> OSD and triggers a cluster-wide resync. Actual RAID: a single disk failure
> does not affect the cluster in any way.

RAID controllers also add a lot of manageability to the mix. The chassis starts beeping and indicates exactly which disk needs replacing, and the controller manages the automatic rebuild after replacement, which makes operations much easier, even for less technical personnel. Also, if you have fast disks and a good RAID controller, it should offload the entire rebuild process from the node's main CPU without a performance hit on the Ceph cluster or the node. As already said, OSDs are expensive on resources, too: having too many of them on one node and then losing the entire node can cause a lot of traffic and load on the remaining nodes while things rebalance.

Regards,

   Oliver
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
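
P.S. To make the capacity-cost arithmetic in my question above easy to check, here is a minimal sketch of it in Python, using only the figures quoted in this thread (200isk/TB raw for commodity nodes, 600isk/TB raw for vendor X, 80% vendor efficiency, 12-disk RAID6 nodes, 2 or 3 Ceph replicas). The script and its helper name are illustrative only; they are not part of Ceph or any vendor tooling.

#!/usr/bin/env python3
# Sketch of the cost-per-usable-TB comparison discussed above.
# All prices and layouts are the ones quoted in the thread.

def cost_per_usable_tb(raw_price_per_tb, raw_per_usable_tb):
    """Raw price multiplied by how many raw TB are needed per usable TB."""
    return raw_price_per_tb * raw_per_usable_tb

# Plain Ceph, one OSD per disk, 3 replicas: 3 TB raw per usable TB.
ceph_3x = cost_per_usable_tb(200, 3)

# Vendor X: 80% of raw is usable, i.e. 1/0.8 raw TB per usable TB.
vendor_x = cost_per_usable_tb(600, 1 / 0.8)

# RAID6 under the OSDs (12 disks, 2 parity) plus 2 Ceph replicas:
# 12/10 raw TB per usable TB inside a node, doubled by replication.
ceph_raid6_2x = cost_per_usable_tb(200, (12 / 10) * 2)

print(f"Ceph, 3x replication:       {ceph_3x:.0f} isk per usable TB")        # 600
print(f"Vendor X, 80% efficiency:   {vendor_x:.0f} isk per usable TB")       # 750
print(f"Ceph, RAID6 + 2x replicas:  {ceph_raid6_2x:.0f} isk per usable TB")  # 480

Note that this only captures the capacity cost; the rebuild behaviour, CPU load and rebalancing traffic discussed further up in the thread are not reflected in it.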