Hi,

Someone is going to correct me if I'm wrong, but I think you misread
something.

The Mon-daemon doesn't need that much RAM:

The 'RAM: 1 GB per daemon' figure is per Mon-daemon, not per OSD-daemon.

The same goes for disk space.

You should read this page again:

http://ceph.com/docs/master/install/hardware-recommendations/

Some of the other questions are answered there as well, such as how much
memory an OSD-daemon needs, and why and when.

On Sat, May 11, 2013 at 03:42:59PM +0200, Tim Mohlmann wrote:
> Hi,
>
> First of all I am new to ceph and this mailing list. At this moment I am
> looking into the possibilities to get involved in the storage business. I am
> trying to get an estimate about costs and after that I will start to determine
> how to get sufficient income.
>
> First I will describe my case, at the bottom you will find my questions.
>
>
> GENERAL LAYOUT:
>
> Part of this cost calculation is of course hardware. For the larger part I've
> already figured it out. In my plans I will be leasing a full rack (46U).
> Depending on the domestic needs I will be using 36 or 40U for OSD storage
> servers. (I will assume 36U from here on, to keep a solid value for
> calculation and have enough spare space for extra devices).
>
> Each OSD server uses 4U and can take 36x3.5" drives. So in 36U I can put
> 36/4=9 OSD servers, containing 9*36=324 HDDs.
>
>
> HARD DISK DRIVES
>
> I have been looking at the Western Digital RE and RED series. RE is more
> expensive per GB, but has a larger MTBF and offers a 4TB model. RED is (real)
> cheap per GB, but only goes as far as 3TB.
>
> In my current calculations it does not matter much if I would put expensive WD
> RE 4TB disks or cheaper WD RED 3TB, the price per GB over the complete cluster
> expense and 3 years of running costs (including AFR) is almost the same.
>
> So basically, if I could reduce the costs of all the other components used in
> the cluster, I would go for the 3TB disk and if the costs will be higher than
> my first calculation, I would use the 4TB disk.
>
> Let's assume 4TB from now on. So, 4*324=1296TB. So let's go Peta-byte ;).
>
>
> NETWORK
>
> I will use a redundant 2x10GbE network connection for each node. Two
> independent 10GbE switches will be used and I will use bonding between the
> interfaces on each node. (Thanks some guy in the #Ceph irc for pointing this
> option out). I will use VLANs to split front-side, backside and Internet
> networks.
>
>
> OSD SERVER
>
> SuperMicro based, 36 HDD hotswap. Dual socket mainboard. 16x DIMM sockets. It
> is advertised they can take up to 512GB of RAM. I will install 2 x Intel Xeon
> E5620 2.40GHz processors, having 4 cores and 8 threads each. For the RAM I am
> in doubt (see below). I am looking into running 1 OSD per disk.
>
>
> MON AND MDS SERVERS
>
> Now comes the big question. What specs are required? At first I had the plan to
> use 4 SuperMicro superservers, with 4-socket mainboards that contain up to
> the new 16core AMD processors and up to 1TB of RAM.
>
> I want all 4 of the servers to run a MON service, MDS service and customer /
> public services. Probably I would use VMs (kvm) to separate them. I will
> compile my own kernel to enable Kernel Samepage Merging, Hugepage support and
> memory compaction to make RAM use more efficient. The requirements for my public
> services will be added up, once I know what I need for MON and MDS.
>
>
> RAM FOR ALL SERVERS
>
> So what would you estimate to be the RAM usage?
> http://ceph.com/docs/master/install/hardware-recommendations/#minimum-hardware-recommendations
>
> Sounds OK for the OSD part. 500 MB per daemon, would put the minimum RAM
> requirement for my OSD server to 18GB. 32GB should be more than enough.
> Although I would like to see if it is possible to use btrfs compression?
> In that case I'd need more RAM in there.
>
> What I really want to know: how much RAM do I need for MON and MDS servers?
> 1GB per daemon sounds pretty steep. As everybody knows, RAM is expensive!
>
> In my case I would need at least 324 GB of RAM for each of them. Initially I
> was planning to use 4 servers and each of them running both. Joining those in
> a single system, with the other duties the system has to perform I would need
> the full 1TB of RAM. I would need to use 32GB modules which are really
> expensive per GB and difficult to find. (not many server hardware vendors in the
> Netherlands have them).
>
>
> QUESTIONS
>
> Question 1: Is it really the number of OSDs that counts for MON and MDS RAM
> usage, or the size of the object store?
>
> Question 2: can I do it with less RAM? Any statistics, or better: a
> calculation? I can imagine memory pages becoming redundant if the cluster
> grows, so less memory required per OSD.
>
> Question 3: If it is the number of OSDs that counts, would it be beneficial to
> combine disks in a RAID 0 (lvm or btrfs) array?
>
> Question 4: Is it safe / possible to store MON files inside of the cluster
> itself? The 10GB per daemon requirement would mean I need 3240GB of storage
> for each MON, meaning I need to get some huge disks and a (lvm) RAID 1 array
> for redundancy, while I have a huge redundant file system at hand already.
>
> Question 5: Is it possible to enable btrfs compression? I know btrfs is not
> stable for production yet, but it would be nice if compression is supported in
> the future, when it does become stable.
>
> If the RAM requirement is not so steep, I am thinking about the possibility to
> run the MON service from 4 OSD servers. Upgrading them to 16x16GB of RAM would
> give me 256GB of RAM. (Again, 32GB modules are too expensive and not an
> option). This would make 2 superservers obsolete, decreasing their workload, and
> keep some spare computing power for future growth.
> The only reason I needed them is for RAM capacity.
>
> Getting rid of 2 superservers will provide me with enough space to fit a 10th
> storage server. This will considerably reduce the total cost per GB of this
> cluster. (Comparing all the hardware without the HDDs, the 4 superservers are
> the most expensive part)
>
> I completely understand if you think: hey! that kind of thing should be
> corporate advice etc. Please understand I am just an individual, just working
> his (non-IT) job and has Linux and open-source as a hobby. I just started
> brainstorming on some business opportunities. If this story would be feasible,
> I would use this information to make a business and investment plan and look
> for investors.
>
> Thanks and best regards,
>
> Tim Mohlmann
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
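
For what it's worth, the sizing arithmetic in this thread can be checked with a few lines of Python. This is only a rough sketch: the per-daemon figures are the minimums quoted in the messages above (500 MB per OSD daemon, 1 GB RAM and 10 GB disk per MON daemon), the counts come from the original post, and the names (`cluster_summary`, the constants) are made up for illustration. Real memory use depends on workload, so treat the output as a lower bound at best.

```python
# Back-of-the-envelope sizing using the numbers quoted in this thread.
# All per-daemon figures are the minimums as cited in the messages above,
# not measurements; real usage depends on workload.

OSD_SERVERS = 9          # 36U of rack space / 4U per server
DISKS_PER_SERVER = 36    # one OSD daemon per disk
DISK_TB = 4              # 4TB disks assumed in the original post
OSD_RAM_GB = 0.5         # "500 MB per daemon"
MON_RAM_GB = 1           # "RAM: 1 GB per daemon", counted per MON daemon
MON_DISK_GB = 10         # "10GB per daemon", counted per MON daemon
MON_COUNT = 4            # the original post plans a MON on each of 4 servers


def cluster_summary():
    """Return the headline numbers for the planned cluster as a dict."""
    total_osds = OSD_SERVERS * DISKS_PER_SERVER
    return {
        "total_osds": total_osds,
        "raw_capacity_tb": total_osds * DISK_TB,
        "osd_ram_per_server_gb": DISKS_PER_SERVER * OSD_RAM_GB,
        # MON requirements scale with the number of MON daemons,
        # not with the number of OSD daemons.
        "mon_ram_per_host_gb": MON_RAM_GB,
        "mon_ram_total_gb": MON_COUNT * MON_RAM_GB,
        "mon_disk_total_gb": MON_COUNT * MON_DISK_GB,
    }


if __name__ == "__main__":
    for key, value in cluster_summary().items():
        print(f"{key}: {value}")
```

Read this way, the MON side comes to a few GB of RAM and a few tens of GB of disk across all monitors, rather than 324 GB of RAM and 3240 GB of disk per monitor.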