Hi,

On Saturday 11 May 2013 16:04:27 Leen Besselink wrote:
> Someone is going to correct me if I'm wrong, but I think you misread
> something.
>
> The Mon-daemon doesn't need that much RAM:
>
> The 'RAM: 1 GB per daemon' is per Mon-daemon, not per OSD-daemon.

Gosh, I feel embarrassed. This actually was my main concern / bottleneck.
Thanks for pointing this out. Seems Ceph really rocks at deploying
affordable data clusters.

Regards,

Tim

> On Sat, May 11, 2013 at 03:42:59PM +0200, Tim Mohlmann wrote:
> > Hi,
> >
> > First of all, I am new to Ceph and this mailing list. At this moment I
> > am looking into the possibilities of getting involved in the storage
> > business. I am trying to get an estimate of the costs, and after that I
> > will start to determine how to get sufficient income.
> >
> > First I will describe my case; at the bottom you will find my questions.
> >
> >
> > GENERAL LAYOUT
> >
> > Part of this cost calculation is of course hardware. For the larger part
> > I've already figured it out. In my plans I will be leasing a full rack
> > (46U). Depending on domestic needs I will be using 36 or 40U for OSD
> > storage servers. (I will assume 36U from here on, to keep a solid value
> > for calculation and have enough spare space for extra devices.)
> >
> > Each OSD server uses 4U and can take 36x 3.5" drives. So in 36U I can
> > put 36/4=9 OSD servers, containing 9*36=324 HDDs.
> >
> >
> > HARD DISK DRIVES
> >
> > I have been looking at the Western Digital RE and Red series. RE is more
> > expensive per GB, but has a larger MTBF and offers a 4TB model. Red is
> > really cheap per GB, but only goes as far as 3TB.
> >
> > At my current calculations it does not matter much whether I put in
> > expensive WD RE 4TB disks or cheaper WD Red 3TB ones: the price per GB
> > over the complete cluster expense and 3 years of running costs
> > (including AFR) is almost the same.
> >
> > So basically, if I can reduce the costs of all the other components used
> > in the cluster, I will go for the 3TB disk, and if the costs turn out
> > higher than my first calculation, I will use the 4TB disk.
> >
> > Let's assume 4TB from now on. So, 4*324=1296TB. So let's go petabyte ;)
> > (this arithmetic is summarised in the sketch below the MON/MDS section).
> >
> >
> > NETWORK
> >
> > I will use a redundant 2x 10GbE network connection for each node. Two
> > independent 10GbE switches will be used and I will use bonding between
> > the interfaces on each node. (Thanks to some guy on the #ceph IRC
> > channel for pointing this option out.) I will use VLANs to split the
> > front-side, back-side and Internet networks.
> >
> >
> > OSD SERVER
> >
> > SuperMicro based, 36 hot-swap HDD bays, dual-socket mainboard, 16 DIMM
> > sockets. It is advertised that they can take up to 512GB of RAM. I will
> > install 2x Intel Xeon E5620 2.40GHz processors, having 4 cores and 8
> > threads each. For the RAM I am in doubt (see below). I am looking into
> > running 1 OSD per disk.
> >
> >
> > MON AND MDS SERVERS
> >
> > Now comes the big question: what specs are required? At first I had the
> > plan to use 4 SuperMicro superservers, with 4-socket mainboards that can
> > take the new 16-core AMD processors and up to 1TB of RAM.
> >
> > I want all 4 of the servers to run a MON service, an MDS service and
> > customer / public services. Probably I would use VMs (KVM) to separate
> > them. I will compile my own kernel to enable Kernel Samepage Merging,
> > hugepage support and memory compaction to make RAM use more efficient.
> > The requirements for my public services will be added on top, once I
> > know what I need for MON and MDS.
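> >
> > To summarise the arithmetic so far, here is a quick back-of-the-envelope
> > script (Python; every figure is an assumption from this mail, and the 3x
> > replication factor is only an example, not a decision):
> >
> >     # Rack capacity sketch -- all inputs are assumptions from this mail.
> >     osd_rack_units = 36      # U reserved for OSD chassis
> >     units_per_chassis = 4    # 4U SuperMicro chassis
> >     bays_per_chassis = 36    # 36x 3.5" hot-swap bays
> >     tb_per_disk = 4          # WD RE 4TB variant
> >
> >     servers = osd_rack_units // units_per_chassis  # 9 servers
> >     disks = servers * bays_per_chassis             # 324 HDDs
> >     raw_tb = disks * tb_per_disk                   # 1296 TB raw
> >
> >     replicas = 3             # example pool size only, not decided yet
> >     usable_tb = raw_tb / replicas                  # ~432 TB usable
> >
> >     print("%d servers, %d disks, %d TB raw, ~%.0f TB usable at %dx"
> >           % (servers, disks, raw_tb, usable_tb, replicas))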
> >
> > RAM FOR ALL SERVERS
> >
> > So what would you estimate the RAM usage to be? See
> > http://ceph.com/docs/master/install/hardware-recommendations/#minimum-hardware-recommendations
> >
> > Sounds OK for the OSD part. 500 MB per daemon would put the minimum RAM
> > requirement for my OSD servers at 18GB, so 32GB should be more than
> > enough. Although I would like to know whether it is possible to use
> > btrfs compression? In that case I'd need more RAM in there.
> >
> > What I really want to know: how much RAM do I need for the MON and MDS
> > servers? 1GB per daemon sounds pretty steep. As everybody knows, RAM is
> > expensive! (This arithmetic is spelled out in the sketches at the end of
> > this mail.)
> >
> > In my case I would need at least 324GB of RAM for each of them.
> > Initially I was planning to use 4 servers, each of them running both.
> > Joining those in a single system, with the other duties the system has
> > to perform, I would need the full 1TB of RAM. I would need to use 32GB
> > modules, which are really expensive per GB and difficult to find (not
> > many server hardware vendors in the Netherlands have them).
> >
> >
> > QUESTIONS
> >
> > Question 1: Is it really the number of OSDs that counts for MON and MDS
> > RAM usage, or the size of the object store?
> >
> > Question 2: Can I do it with less RAM? Any statistics, or better: a
> > calculation? I can imagine memory pages becoming redundant as the
> > cluster grows, so less memory would be required per OSD.
> >
> > Question 3: If it is the number of OSDs that counts, would it be
> > beneficial to combine disks in a RAID 0 (LVM or btrfs) array?
> >
> > Question 4: Is it safe / possible to store the MON files inside the
> > cluster itself? The 10GB-per-daemon requirement would mean I need 3240GB
> > of storage for each MON (also spelled out at the end of this mail),
> > meaning I would need to get some huge disks and an (LVM) RAID 1 array
> > for redundancy, while I have a huge redundant file system at hand
> > already.
> >
> > Question 5: Is it possible to enable btrfs compression? I know btrfs is
> > not stable for production yet, but it would be nice if compression is
> > supported in the future, when it does become stable.
> >
> > If the RAM requirement is not so steep, I am thinking about the
> > possibility of running the MON service on 4 of the OSD servers.
> > Upgrading them to 16x 16GB of RAM would give me 256GB of RAM. (Again,
> > 32GB modules are too expensive and not an option.) This would obsolete 2
> > superservers, decrease the workload on the remaining two, and keep some
> > spare computing power for future growth. The only reason I needed them
> > was for RAM capacity.
> >
> > Getting rid of 2 superservers will provide me with enough space to fit a
> > 10th storage server. This will considerably reduce the total cost per GB
> > of this cluster. (Comparing all the hardware without the HDDs, the 4
> > superservers are the most expensive part.)
> >
> > I completely understand if you think: hey, this kind of thing should be
> > corporate advice, etc. Please understand I am just an individual,
> > working his (non-IT) job, who has Linux and open source as a hobby. I
> > just started brainstorming on some business opportunities. If this story
> > turns out to be feasible, I will use this information to make a business
> > and investment plan and look for investors.
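> >
> > To spell out my RAM arithmetic in one place, a small Python sketch (the
> > per-daemon figures are my reading of the hardware recommendations page,
> > which may well be wrong):
> >
> >     # RAM sketch -- per-daemon figures are my reading of the
> >     # hardware recommendations page.
> >     osds_per_server = 36
> >     total_osds = 324
> >
> >     osd_ram_gb = osds_per_server * 0.5  # 500 MB per OSD daemon -> 18 GB
> >     mon_ram_gb = total_osds * 1.0       # 1 GB 'per daemon' -- if that
> >                                         # means per OSD daemon -> 324 GB
> >
> >     print("OSD host: %.0f GB, MON host: %.0f GB"
> >           % (osd_ram_gb, mon_ram_gb))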
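> >
> > And the same for the MON store size from Question 4 (again my own
> > arithmetic, assuming the '10GB per daemon' figure scales with the number
> > of OSD daemons):
> >
> >     # MON store sketch -- assumes the 10 GB figure is per OSD daemon.
> >     total_osds = 324
> >     mon_store_gb = total_osds * 10      # 3240 GB per monitor
> >     print("MON store: %d GB per monitor" % mon_store_gb)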
> >
> > Thanks and best regards,
> >
> > Tim Mohlmann

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com