deployment architecture practices / new ideas?

We're looking to deploy Ceph on about 8 Dell servers to start, each of which typically contains 6 to 8 hard disks behind PERC RAID controllers that support write-back cache (usually ~512 MB). Most machines have between 32 and 128 GB of RAM. Our questions are below; please feel free to comment on even just one of them if that's your area of expertise or interest.

  1. Various "best practice" guides suggest putting the OS on a separate disk. We thought that would be wasteful, because we'd sacrifice a whole disk on each machine (~3 TB), or even two whole disks (~6 TB) if we put a hardware RAID 1 under it. So, do people normally just sacrifice one whole disk? Specifically, we came up with this idea:
    1. We set all hard disks to "pass-through" in the RAID controller, so the controller's write-back cache is still in effect but the OS just sees a bunch of disks (6 to 8 in our case).
    2. We then build a software RAID 1 (md, on CentOS 6.4) for the OS across all 6 to 8 disks (i.e., a small partition on each, with the rest of each disk left for an OSD).
    3. We then build a software RAID 0 (also md) for the swap space across another small partition on each disk.
    4. Does anyone see any flaws in this idea? We think RAID 1 is not computationally expensive for the machines to compute, and most of the time the OS should be in RAM anyway. Similarly, RAID 0 should be cheap for the CPU, and hopefully we won't hit swap much if we have enough RAM. This way we don't sacrifice one or two whole disks just for the OS. (A rough sketch of the layout we have in mind is at the bottom of this message.)
  2. Based on Mark Nelson's performance benchmark blog post (http://ceph.com/community/ceph-performance-part-2-write-throughput-without-ssd-journals/), has anything substantially changed since then? Specifically, it suggests that SSD journals may not really be necessary if one has RAID controllers with write-back cache. Is this still true, even though the article was written against a version of Ceph that is now over a year old? (Mark suggests that things may change with newer versions of Ceph.)
  3. Based on our understanding, Ceph can deliver very high aggregate throughput (especially for reads) when dozens and dozens of hard disks are being accessed simultaneously across multiple machines, so we could see several GB/s of throughput, right? (Ceph never advertises the read-throughput advantage of its distributed architecture, so I'm wondering if I'm missing something.) If so, is it reasonable to assume that one common bottleneck is the network, and that a single 1 Gb/s NIC would be a major limitation? If so, we're thinking of bonding four 1 Gb/s NICs into one 4 Gb/s link, but we haven't seen anyone discuss this strategy. Are there any holes in it? Or does Ceph "automatically" take advantage of multiple NICs without us having to deal with the complexity (and the expense of buying a switch that supports bonding)? That is, is it possible and a good idea to set up specific OSDs to use specific NICs, so that we spread the load? (We read the recommendation to use different NICs for front-end vs. back-end traffic, but we're not worried about network attacks, so we're thinking that one big, fat Ethernet pipe gives us the most flexibility. A sample bonding configuration is sketched at the bottom of this message.)
  4. I'm a little confused: does Ceph support incremental snapshots of either VM images or CephFS? I saw this statement in the release notes for the "dumpling" release (http://ceph.com/docs/master/release-notes/#v0-67-dumpling): "The MDS now disallows snapshots by default as they are not considered stable. The command ‘ceph mds set allow_snaps’ will enable them." So, should we assume that we can't do stable, incremental file-system snapshots until further notice? (A sketch of the kind of snapshot workflow we're hoping for is at the bottom of this message.)
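
To make question 1 concrete, here is a rough sketch of the layout we have in mind, assuming the pass-through disks show up as /dev/sda through /dev/sdh and that we reserve two small partitions at the front of each disk (device names and sizes below are just illustrative, not something we've tested yet):

    # On each disk: sdX1 (~20 GB) for the OS mirror, sdX2 (~4 GB) for swap,
    # and the remainder (sdX3) left for a Ceph OSD.
    # 8-way software RAID 1 for the OS:
    mdadm --create /dev/md0 --level=1 --raid-devices=8 /dev/sd[a-h]1
    # RAID 0 striped across the same disks for swap:
    mdadm --create /dev/md1 --level=0 --raid-devices=8 /dev/sd[a-h]2
    mkswap /dev/md1
    swapon /dev/md1

The idea is that /dev/md0 holds the root filesystem and /dev/md1 is the swap device, so no whole disk is given over to just the OS.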
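
For question 3, this is the kind of setup we're imagining on CentOS 6.4: an 802.3ad (LACP) bond of four 1 Gb/s NICs, with the bonded network carrying all Ceph traffic. Interface names, addresses, and subnets are placeholders:

    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    BOOTPROTO=none
    ONBOOT=yes
    IPADDR=192.168.10.11
    NETMASK=255.255.255.0
    BONDING_OPTS="mode=802.3ad miimon=100 xmit_hash_policy=layer3+4"

    # /etc/sysconfig/network-scripts/ifcfg-eth0 (and likewise for eth1..eth3)
    DEVICE=eth0
    MASTER=bond0
    SLAVE=yes
    BOOTPROTO=none
    ONBOOT=yes

    # /etc/ceph/ceph.conf
    [global]
    public network = 192.168.10.0/24
    # we'd leave the cluster (back-end) traffic on the same "fat pipe"
    # rather than splitting front-end and back-end networks

One thing we do realize is that 802.3ad hashes traffic per flow, so any single TCP connection still tops out at about 1 Gb/s; the aggregate 4 Gb/s only shows up when many clients and OSDs are talking at once. Is that the main catch people run into with this approach?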
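
For question 4, the kind of incremental workflow we're hoping is already stable is the RBD one (as opposed to CephFS snapshots). Pool, image, and snapshot names below are made up, and we believe the export-diff/import-diff commands arrived around Dumpling, so please correct us if that's wrong:

    # Point-in-time snapshot of a VM image:
    rbd snap create rbd/vm-disk-1@backup-2013-09-01
    # Incremental export containing only the changes since the previous snapshot:
    rbd export-diff --from-snap backup-2013-08-01 rbd/vm-disk-1@backup-2013-09-01 /backups/vm-disk-1.diff
    # Replay the diff into a copy of the image elsewhere:
    rbd import-diff /backups/vm-disk-1.diff vm-disk-1

If that's the right way to think about it, then our reading is that RBD snapshots are usable and only the CephFS (MDS) snapshots sit behind the switch quoted above. Is that correct?
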
-Sidharta
--
Gautam Saxena
President & CEO
Integrated Analysis Inc.

Making Sense of Data.
Biomarker Discovery Software | Bioinformatics Services | Data Warehouse Consulting | Data Migration Consulting
www.i-a-inc.com 
gsaxena@xxxxxxxxxxx
(301) 760-3077  office
(240) 479-4272  direct
(301) 560-3463  fax
