Re: Planning a home ceph cluster

On 2014-02-11 11:41 PM, Mark Nelson wrote:

I've been planning to build myself a server cluster as a sort of hobby project, and I've decided to use Ceph for its storage system. I have a few questions, though.

My plan is to build 3 relatively dense servers (20 drive bays each) and fill each one with relatively consumer-grade equipment (AMD 8-core FX processor, 24+ GB ECC RAM, and a decent SAS card that can provide a channel to each drive). For drives, I was planning on using 3 TB or 4 TB WD Red drives (fairly cheap but should be reliable). I'm only budgeting ~$7500 for it, so I'll only populate 5 drives per node from the get-go, but I can just fill them up as my storage requirements grow.

I'd suggest considering smaller servers with fewer drives to spread things out a bit more if you can, and growing by adding servers rather than adding more disks to those 3 servers. Not sure if you are going for rackmount or not. There are plenty of rackmount case options from Supermicro and others, but if you are going for a more consumer-oriented case, these look interesting:

http://www.u-nas.com/xcart/product.php?productid=17617&cat=0&featured=Y

I was indeed going for a rackmount option. My current plan uses the Norco RPC-4220, which stuffs 20 hotswap bays into a 4U enclosure. I've read many, many reviews of Norco cases, and there seem to be mixed feelings about them, but they're the only option (that I've found) in my price range. For reference, the RPC-4220 is only $330 from Newegg at the moment.

One of the variants that I planned and priced out was for 2U, 12-bay nodes. I originally threw out the idea for cost reasons, and because building in a 2U case is more difficult than building in a 4U case, but after reading your response, I'm going to take another look at it. With 12-bay nodes, do you think a 2x or 3x 1 Gbps link would suffice?
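For my own rough sizing (a back-of-envelope estimate only, assuming ~150 MB/s sequential per Red drive and ~115 MB/s usable per 1 Gbps link):

    12 drives x ~150 MB/s   =  ~1800 MB/s aggregate disk bandwidth
    2 x 1 Gbps              =   ~230 MB/s usable network bandwidth
    3 x 1 Gbps              =   ~345 MB/s usable network bandwidth

So the network looks like the bottleneck for sequential work either way; the extra link would mostly just give client and replication traffic more room to avoid starving each other.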

 * I'm planning on having either 3x or 5x 1 Gbps Ethernet ports on each node, with a decent managed switch. I should be able to aggregate these links however I wish - say, use just a single 5 Gbps connection to the switch, or split it into a 2 Gbps front-end connection and a 3 Gbps back-end connection. I would value any input on which configuration would likely be best. Both fiber and 10 Gbps copper are outside of my price range.

You may find that link aggregation is less efficient once you go beyond 2 links. How you set this up really depends on your goals and what kind of read/write workload you end up with. If you are write heavy and have 2x replication or more, having a separate backend network can be nice.

Yeah, I've never used link aggregation before, so this is totally new territory for me.
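From what I've read so far, a Linux bond in 802.3ad (LACP) mode against the managed switch seems to be the usual starting point. A minimal sketch on Debian/Ubuntu with the ifenslave package might look something like this (interface names and addresses are just placeholders):

    # /etc/network/interfaces -- bond two GigE ports with LACP
    auto bond0
    iface bond0 inet static
        address 192.168.10.11
        netmask 255.255.255.0
        bond-slaves eth0 eth1
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4

If I understand it correctly, the layer3+4 hash policy should spread Ceph's many TCP connections across the links better than the default MAC-based policy would.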

Overall, I expect to have many more reads than writes.

My biggest concern in this area was that if I was accessing data on the cluster through a VM (such as a web server or Samba server), then the VM traffic would probably also be on the "front-end" network. As a result, a read would pull from all 3 nodes to the single node where the VM was running, and then from that node to the client, which may put too much stress on the front-end network.
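If I do end up with separate front-end and back-end networks, my understanding is that Ceph only needs two settings in ceph.conf on each node to split the traffic (the subnets below are placeholders for whatever I actually use):

    [global]
        # client/VM ("front-end") traffic
        public network = 192.168.10.0/24
        # OSD replication and recovery ("back-end") traffic
        cluster network = 192.168.20.0/24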

   * I'm planning on buying a single SSD for each node for the OS and journals. As I populate the nodes, I was going to buy a second SSD and split each SSD into two partitions - so I can have a RAID 1 partition for the OS and a larger RAID 0 partition for the journals. Is this unwise? Will two SSDs be able to provide enough throughput and IOPS for 20 journals, or do I need to plan for more?

I would strongly suggest you not put 20 journals on a single SSD. If the SSD dies, you lose all of the OSDs in that node at once. If you are doing a lot of writes, you may also be really hammering those SSDs. As far as performance goes, it'll limit your sequential write performance, but probably not as much as your network will with 20 drives per node. For small random writes, it probably depends on the SSD used. I'd suggest looking at the Crucial M500.

Oh, this is a good point that I didn't consider - with RAID 0, a single SSD failure brings down the whole node. Would 6 OSDs per SSD be more reasonable? Getting more than 2 SSDs into the nodes would prove a challenge, since I've been planning on using 4 of most motherboards' 6 SATA III ports to drive one of the hard drive backplanes. I've actually used Crucial SSDs in my last two desktop builds with no complaints, so I'll definitely look at them for this.
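If I drop the RAID 0 idea and just give each OSD its own journal partition on one of the SSDs, I think the setup would look something like this with ceph-deploy (device names are placeholders):

    # data disk : journal partition on the SSD
    ceph-deploy osd prepare node1:/dev/sdc:/dev/sda5
    ceph-deploy osd prepare node1:/dev/sdd:/dev/sda6
    ceph-deploy osd prepare node1:/dev/sde:/dev/sda7

That way, losing one SSD should only take out the OSDs whose journals live on it rather than the whole node.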

Thanks for your help!

- Ethan
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



