On Tue, 3 Feb 2015 15:16:57 +0000 Colombo Marco wrote:

> Hi all,
> I have to build a new Ceph storage cluster. After I've read the
> hardware recommendations and some mail from this mailing list, I would
> like to buy these servers:

Nick already mentioned a number of things I totally agree with, so don't
be surprised if some of this feels like a repeat.

> OSD:
> SSG-6027R-E1R12L ->
> http://www.supermicro.nl/products/system/2U/6027/SSG-6027R-E1R12L.cfm
> Intel Xeon e5-2630 v2
> 64 GB RAM

As Nick said, v3 and more RAM might be helpful; depending on your use
case (small writes versus large ones), even faster CPUs as well.

> LSI 2308 IT
> 2 x SSD Intel DC S3700 400GB
> 2 x SSD Intel DC S3700 200GB

Why the separation of SSDs? They aren't going to be that busy with
regard to the OS.

Get a case like the one Nick mentioned with two 2.5" bays in the back,
put two DC S3700 400GBs in there (connected to the onboard 6Gb/s SATA3
ports), and partition them so that you have a RAID1 for the OS and plain
partitions for the journals of the now 12 OSD HDDs in your chassis.

Of course this optimization in terms of cost and density comes at a
price: if one SSD should fail, you will have 6 OSDs down. Given how
reliable the Intels are this is unlikely, but it is something you need
to consider.

If you want to limit the impact of an SSD failure and have just 2 OSD
journals per SSD, get a chassis like the one above and 4 DC S3700
200GBs, RAID10 them for the OS, and put 2 journal partitions on each. I
did the same with 8 3TB HDDs and 4 DC S3700 100GBs; the HDDs (and the
CPU, with 4KB IOPS) are the limiting factor, not the SSDs.

> 8 x HDD Seagate Enterprise 6TB

Are you really sure you need that density? One disk failure will result
in a LOT of data movement once these become somewhat full. If you were
to go for a 12-OSD node as described above, consider 4TB drives for the
same overall density, while having more IOPS and likely the same price
or less.
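As a quick sanity check of the 2-SSD / 12-OSD journal layout suggested
above, here is a back-of-the-envelope sketch. The Ceph docs give the
filestore journal sizing rule as 2 * expected throughput * filestore max
sync interval; the per-HDD throughput and the OS partition size below
are my assumptions, not figures from this thread:

```python
# Sanity check: do 6 OSD journals plus a RAID1 OS slice fit on one
# DC S3700 400GB? Sizing rule from the Ceph filestore docs:
#   journal size = 2 * throughput * filestore_max_sync_interval
# Assumed: ~150 MB/s sustained per HDD, default 5 s sync interval,
# 40 GB OS partition (all assumptions, adjust to taste).

SSD_CAPACITY_GB = 400   # DC S3700 400GB
OS_RAID1_GB     = 40    # assumed OS partition, mirrored across both SSDs
OSDS_PER_SSD    = 6     # 12 HDD OSDs, journals split over 2 SSDs
HDD_MBPS        = 150   # assumed sustained write rate per HDD
SYNC_INTERVAL_S = 5     # filestore max sync interval (default)

journal_gb = 2 * HDD_MBPS * SYNC_INTERVAL_S / 1000.0  # per-OSD journal, GB
needed_gb = OS_RAID1_GB + OSDS_PER_SSD * journal_gb

print(f"per-OSD journal: {journal_gb:.1f} GB")
print(f"used per SSD:    {needed_gb:.1f} of {SSD_CAPACITY_GB} GB")
```

Even with generous rounding this uses well under a quarter of the SSD;
leaving the rest unpartitioned also gives the S3700's wear leveling more
spare area to work with.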
> 2 x 40GbE for backend network

You'd be lucky to write more than 800MB/s sustained to your 8 HDDs
(remember they will have to deal with competing reads and writes; this
is not a sequential synthetic write benchmark). Incidentally, 1GB/s to
1.2GB/s (depending on configuration) would also be the limit of your
journal SSDs.

Other than backfilling caused by cluster changes (OSDs removed/added),
your limitation is nearly always going to be IOPS, not bandwidth. So
2x10GbE, or if you're comfortable with it (I am ^o^) an Infiniband
backend (can be cheaper, less latency, plans for RDMA support in Ceph),
should be more than sufficient.

> 2 x 10GbE for public network
>
> META/MON:
>
> SYS-6017R-72RFTP ->
> http://www.supermicro.com/products/system/1U/6017/SYS-6017R-72RFTP.cfm
> 2 x Intel Xeon e5-2637 v2
> 4 x SSD Intel DC S3500 240GB raid 1+0

You're likely to get better performance and of course MUCH better
durability by using 2 DC S3700s, at about the same price.

> 128 GB RAM

Total overkill for a MON, but I have no idea about MDS requirements, and
RAM never hurts.

In your follow-up you mentioned 3 mons. I would suggest putting 2 more
mons (only, not MDS) on OSD nodes and making sure that within the IP
numbering the "real" mons have the lowest IP addresses, because the MON
with the lowest IP becomes master (and thus the busiest). This way you
can survive the loss of 2 nodes and still have a valid quorum.

Christian

> 2 x 10 GbE
>
> What do you think?
> Any feedback, advice, or ideas are welcome!
>
> Thanks so much
>
> Regards,

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
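The arithmetic behind the "2x10GbE is enough" point above can be
sketched as follows. The ~100 MB/s per-HDD figure under mixed load is my
assumption; the 800MB/s ceiling and the journal SSD limit are the
numbers from the text:

```python
# Rough throughput comparison: HDD write ceiling vs. network capacity.
# Assumed: ~100 MB/s sustained per HDD under competing reads/writes.

MBPS_PER_GBIT = 1000 / 8           # 1 Gbit/s ~ 125 MB/s raw

hdd_ceiling_mb = 8 * 100           # ~800 MB/s across 8 HDDs
ssd_ceiling_mb = 1200              # upper end of the journal SSD limit
net_2x10_mb = 2 * 10 * MBPS_PER_GBIT   # 2x10GbE bonded
net_2x40_mb = 2 * 40 * MBPS_PER_GBIT   # 2x40GbE bonded

print(f"HDD write ceiling: ~{hdd_ceiling_mb} MB/s")
print(f"2x10GbE capacity:  ~{net_2x10_mb:.0f} MB/s")
print(f"2x40GbE capacity:  ~{net_2x40_mb:.0f} MB/s (mostly idle)")
```

Even 2x10GbE leaves roughly 3x headroom over what the HDDs (or the
journal SSDs) can actually sink, before replication traffic; 2x40GbE
would sit almost entirely unused.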