Re: Planning a home ceph cluster

On 02/12/2014 06:23 AM, Ethan Levine wrote:

On 2014-02-11 11:41 PM, Mark Nelson wrote:

I've been planning on building myself a server cluster as a sort of
hobby project, and I've decided to use Ceph for its storage system. I
have a few questions, though.

My plan is to build 3 relatively dense servers (20 drive bays each) and
fill each one with relatively consumer-grade equipment (an AMD 8-core FX
processor, 24+ GB of ECC RAM, and a decent SAS card that can provide a
channel to each drive).  For drives, I was planning on using 3 TB or 4
TB WD Red drives (fairly cheap but should be reliable).  I'm only
budgeting ~$7500 for it, so I'll only populate 5 drives per node from
the get-go, but I can just fill them up as my storage requirements grow.

I'd suggest considering smaller servers with fewer drives to spread
things out a bit more if you can, and grow by adding servers rather
than adding more disks to those 3 servers.  Not sure if you are going
for rackmount or not.  There are plenty of rackmount case options from
Supermicro and others, but if you are going for a more consumer-oriented
case these look interesting:

http://www.u-nas.com/xcart/product.php?productid=17617&cat=0&featured=Y

I was indeed going for a rackmount option.  My current plan uses the
Norco RPC-4220, which stuffs 20 hotswap bays into a 4U enclosure. I've
read many, many reviews of Norco cases, and there seem to be mixed
feelings about them, but they're the only option (that I've found) in my
price range.  For reference, the RPC-4220 is only $330 from Newegg at
the moment.

In terms of price/storage, that chassis is really quite good, probably
the best you can do for a rackmount case, especially considering that it
appears to take standard power supplies (figure the total cost will come
in at around $400-450 with a decent brand power supply).  Of course,
having non-redundant power supplies in a case with 20 drives is a bit of
a bummer.

I think the trick for you will be to figure out how much saving money
on the case and motherboard is worth to you vs. having more nodes in
your cluster.  The next best options are probably something like:

chenbro 12-bay 2U:

http://www.amazon.com/Chenbro-Mini-SAS-Backplane-Chassis-RM23612M2-L/dp/B00DVFMFLS/ref=pd_sim_sbs_pc_3

norco 12-bay 2U:

http://www.amazon.com/Rackmount-Server-Case-Hot-Swappable-Drive/dp/B004IXTYB6/ref=sr_1_2?s=electronics&ie=UTF8&qid=1392217499&sr=1-2&keywords=rpc-2212

chenbro 8-bay 2U:

http://www.amazon.com/Chenbro-Computing-Storage-Chassis-RM23608M2-L/dp/B00DVFMDI8/ref=sr_1_9?ie=UTF8&qid=1392217092&sr=8-9&keywords=chenbro+2U

If you go with the 4U, or a 2U case with horizontal expansion cards,
you can add one or more of these too:

http://www.amazon.com/Syba-Mount-Mobile-2-5-Inch-SY-MRA25023/dp/B0080V73RE/ref=cm_cr_pr_sims_t

If you use the 2U cases and get a Supermicro motherboard with an
on-board SAS2308, that would let you use two of those plus your network
card.  In the 4U case you could potentially use several of them.


One of the variants that I planned and priced out was for 2U, 12-bay
nodes.  I originally threw out the idea for cost reasons, and because
building in a 2U case is more difficult than building in a 4U case, but
after reading your response, I'm going to take another look at it.  With
12-bay nodes, do you think a 2x or 3x 1 Gbps link would suffice?
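
As a rough back-of-envelope sketch (the per-drive and per-link
throughput figures below are assumptions, not benchmarks), here is how
a fully populated 12-bay node compares to a 2x or 3x 1 Gbps bond:

    # Back-of-envelope sketch in Python; all numbers are assumptions.
    drives_per_node = 12
    hdd_seq_mb_s = 100         # rough sequential rate of one 7200 rpm SATA drive
    raw_disk_mb_s = drives_per_node * hdd_seq_mb_s   # ~1200 MB/s of raw disk bandwidth

    gbps_to_mb_s = 1000 / 8    # 1 Gbps is roughly 125 MB/s before protocol overhead
    bond_2x = 2 * gbps_to_mb_s   # ~250 MB/s
    bond_3x = 3 * gbps_to_mb_s   # ~375 MB/s

    print(raw_disk_mb_s, bond_2x, bond_3x)

So for large sequential transfers the network saturates long before the
disks do either way; whether 2x or 3x matters mostly comes down to how
much of the traffic is streaming reads vs. small random IO.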

 * I'm planning on having either 3x or 5x 1 Gbps Ethernet ports on each
node, with a decent managed switch.  I should be able to aggregate these
links however I wish: say, use just a single 5 Gbps connection to the
switch, or split it into a 2 Gbps front-end connection and a 3 Gbps
back-end connection.  I would value any input on which configuration
would likely be best.  Both fiber and 10 Gbps copper are outside of my
price range.

You may find that link aggregation is less efficient once you go
beyond 2 links.  How you set this up really depends on your goals and
the kind of read/write workload you end up with.  If you are
write-heavy and have 2x replication or more, having a separate back-end
network can be nice.
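
For reference, the front-end/back-end split maps directly onto Ceph's
public and cluster networks in ceph.conf.  A minimal sketch, assuming
made-up subnets (use whatever addressing you actually have):

    [global]
        ; client-facing ("front-end") traffic: VMs, Samba, monitors
        public network = 192.168.1.0/24
        ; OSD replication and recovery ("back-end") traffic
        cluster network = 192.168.2.0/24

With only the public network defined, replication traffic shares the
front-end links; defining both is what actually splits the load.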

Yeah, I've never used link aggregation before, so this is totally new
territory for me.
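
In case a concrete example helps, here is roughly what an LACP bond
looks like with ifupdown on Debian/Ubuntu (interface names and the
address are placeholders, and the switch ports have to be configured
for LACP as well):

    # LACP (802.3ad) bond; requires matching switch configuration
    auto bond0
    iface bond0 inet static
        address 192.168.1.11
        netmask 255.255.255.0
        bond-slaves eth0 eth1
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4

One caveat worth knowing up front: with 802.3ad each individual TCP
connection is hashed onto a single member link, so a lone client stream
still tops out at 1 Gbps; the aggregate only helps with several
concurrent flows.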

Overall, I expect to have many more reads than writes.

My biggest concern in this area was that if I was accessing data on the
cluster through a VM (such as a web server or Samba server), then the VM
traffic would probably also be on the "front-end" network.  As a result,
a read would pull from all 3 nodes to a single node (wherever the VM was
running), and then from that node to the client, which may put too much
stress on the front-end network.

   * I'm planning on buying a single SSD for each node for the OS and
journals.  As I populated the nodes, I was going to buy a second SSD,
and split each SSD into two partitions - so I can have a RAID 1
partition for the OS and a larger RAID 0 partition for the journals.  Is
this unwise?  Will two SSDs be able to provide enough throughput and
IOPS for 20 journals, or do I need to plan for more?

I would strongly suggest you not put 20 journals on a single SSD.  If
the SSD dies, you lose all of the OSDs in that node at once.  If you
are doing a lot of writes, you may be really hammering those SSDs too.
As far as performance goes, it'll limit your sequential write
performance, but probably not as much as your network with 20 drives
per node.  For small random writes, it probably depends on the SSD
used.  I'd probably suggest looking at the Crucial M500.
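
To put rough numbers on the sequential-write point (everything below is
an assumption, not a benchmark):

    # Why one journal SSD caps sequential writes for 20 OSDs; numbers assumed.
    osds_per_node  = 20
    hdd_write_mb_s = 100    # rough sequential write rate of one 7200 rpm SATA drive
    ssd_write_mb_s = 450    # rough sequential write rate of a SATA SSD

    disks_could_absorb = osds_per_node * hdd_write_mb_s   # ~2000 MB/s
    journal_bottleneck = ssd_write_mb_s   # every write lands in the journal first

So the node would cap out around 450 MB/s of writes instead of ~2000
MB/s - though, as above, a 2-3 Gbps front-end connection (~250-375
MB/s) would likely be the first bottleneck anyway.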

Oh, this is a good point that I didn't consider - with RAID 0, a single
SSD failure brings down the whole node.  Would 6 OSDs per SSD be more
reasonable?  Getting more than 2 SSDs into the nodes would prove a
challenge (since I've been planning on using 4 of most motherboards' 6
SATA III ports to drive one of the backplanes of hard drives).  I've
actually used Crucial SSDs in my last two desktop builds with no
complaints, so I'll definitely look at them for this.
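
For what it's worth, pointing a handful of OSD journals at partitions
on each SSD is just per-OSD configuration.  A minimal ceph.conf sketch,
with entirely made-up partition labels and only the first few OSDs
shown:

    [osd]
        ; journal size in MB
        osd journal size = 10000

    [osd.0]
        osd journal = /dev/disk/by-partlabel/journal-0
    [osd.1]
        osd journal = /dev/disk/by-partlabel/journal-1
    [osd.2]
        osd journal = /dev/disk/by-partlabel/journal-2

That keeps the journal-to-SSD mapping explicit, so if one SSD does die
you know exactly which OSDs went with it.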

Thanks for your help!

- Ethan
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
