New User Q: General config, massive temporary OSD loss

Hi, I’m an admin for the School of Interactive Games and Media at RIT, and I’m looking into using Ceph to reorganize and consolidate my department’s storage. I’ve read a lot of documentation and discussion on the web, but I’m not 100% sure that what I’m looking at doing is a good use of Ceph. I was hoping to get some input on that, as well as an answer to a more specific question about OSDs going offline.

 

 

First questions: Are there obvious flaws or concerns in the configuration proposed below that I should be aware of? Does it even make sense to use Ceph here? Is there anything else I should know, think about, or do instead?

 

 

My more specific question relates to the two RAID controllers in the MD3200 and my intended 2- or 3-copy replication (with striping): what happens if every OSD holding a copy of a given piece of data goes down for a period of time, but the OSDs later come back “intact” (e.g. after being moved to a different controller)?
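
For a planned controller move I gather I could just set the cluster’s noout flag so it doesn’t start re-replicating while those OSDs are offline, roughly like this (based on my reading of the docs, so correct me if I have it wrong):

    ceph osd set noout      # keep down OSDs from being marked "out" and triggering recovery
    # ...stop the affected OSDs, move their disks to the other controller, start them again...
    ceph osd unset noout    # resume normal down/out handling

What I’m really wondering about is the unplanned case: what does Ceph do with that data once the OSDs reappear?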

 

I know this can be limited or prevented entirely using good failure domain organization, but it’s still a question I haven’t found an answer to.
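
For what it’s worth, the failure-domain layout I have in mind is roughly the following, with one CRUSH bucket per RAID controller so no replica set lives entirely behind a single controller (bucket and host names are made up, and I may well be misreading the CRUSH docs):

    # Hypothetical: one "chassis" bucket per MD3200 controller (or a custom bucket type)
    ceph osd crush add-bucket controller-a chassis
    ceph osd crush add-bucket controller-b chassis
    ceph osd crush move controller-a root=default
    ceph osd crush move controller-b root=default
    ceph osd crush move osdhost1 chassis=controller-a
    ceph osd crush move osdhost2 chassis=controller-b
    # plus a placement rule whose replica step is something like
    #   step chooseleaf firstn 0 type chassis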

 

 

Sorry for the wall o’ text, and thanks in advance for any help or advice you can provide.

 

 

Proposed Configuration/Architecture:

 

I’ll note that most of the post-implementation use would be RBD images accessed over 10Gbit Ethernet, which would then serve as file storage and/or KVM virtual drives. Once CephFS is stable enough for production I’d like to use it for file storage, but not just yet.
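
Concretely, I’m picturing usage along these lines (pool and image names are placeholders):

    # Create a 100 GB image and map it with the kernel RBD client for plain file storage:
    rbd create vmpool/fileserver-disk --size 102400
    rbd map vmpool/fileserver-disk      # shows up as /dev/rbd0 (or /dev/rbd/vmpool/fileserver-disk)
    mkfs.xfs /dev/rbd0
    mount /dev/rbd0 /srv/files
    # KVM guests would instead attach images through qemu/libvirt's native rbd support.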

 

Currently, we have a Dell MD3200 and two attached MD1200s set up as a sort of mini-SAN, which exports chunks of block storage to various servers via SAS. The whole assemblage has roughly 60TB of capacity spread across 36 disks (24x 1TB drives and 12x 3TB drives). We also have several TB of storage scattered across 10-ish drives in 2-3 actual servers. This is all raw capacity on 7200RPM drives.

 

There are a few problems with this configuration:

- Expanding storage in a usable way is fairly difficult, especially if new drives don’t match the existing drive sizes in the MD SAN.

- It is relatively easy to end up with “slack”: large chunks of storage that can’t be easily allocated or reallocated.

- Only a limited number of systems can access the storage directly (due to the limited number of SAS ports).

- Upgrades are difficult (massive manual data migration needed).

- There are 2 points of failure (exactly 2 RAID controllers in the MD3200; Ceph wouldn’t solve this problem immediately).

 

My goals are to have easily expandable and upgradable storage for several servers (mainly, but not entirely, virtual), to eliminate the “slack” in the current configuration, and to be able to migrate away from the MD pseudo-SAN relatively easily in the future. I’m also looking to lay the infrastructure groundwork for an OpenStack cloud implementation or similar.

 

My notion is to split the contents of the Dell MD array and allocate the individual drives to 3-4 servers (6-9 drives per server), each of which would run one OSD per drive on top of XFS. I’d likely set up SSDs as journal drives, each holding the journals for ~6 OSDs (10-20GB of journal per OSD). The servers would have appropriate RAM (1+GB per OSD, plus 1-2GB for a monitor) and 4-6 core Xeons from the Core i generation.
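
In ceph.conf terms I’m imagining something like the following (sizes and device names are placeholders, so feel free to tell me the ratios are off):

    [global]
        osd journal size = 10240      ; 10GB journal per OSD
        osd mkfs type = xfs

    # One OSD per data disk, journals on partitions of a shared SSD, e.g. with ceph-deploy:
    #   ceph-deploy osd create osdhost1:sdb:/dev/sdg1
    #   ceph-deploy osd create osdhost1:sdc:/dev/sdg2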

 

The back-end communication network for Ceph would be 10Gbit Ethernet. I may have to have the Ceph clients read their data over that back-end network as well; I realize this is probably not ideal, but I’m hoping/thinking it will be good enough.
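
In other words, something along these lines in ceph.conf (subnets made up):

    [global]
        public network  = 192.168.10.0/24    ; clients and monitors
        cluster network = 192.168.20.0/24    ; OSD replication and heartbeat traffic

    # If the clients end up sharing the 10Gbit back end, I assume I'd just point
    # "public network" at that same subnet and drop the separate cluster network.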

 

 

-----

Edward Huyer

School of Interactive Games and Media

Golisano 70-2373

152 Lomb Memorial Drive

Rochester, NY 14623

585-475-6651

erhvks@xxxxxxx

 

Obligatory Legalese:

The information transmitted, including attachments, is intended only for the person(s) or entity to which it is addressed and may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information by persons or entities other than the intended recipient is prohibited. If you received this in error, please contact the sender and destroy any copies of this information.

 

