On Mon, 7 Sep 2015 12:11:27 +0200 Jan Schermer wrote:

> Dense SSD nodes are not really an issue for network (unless you really
> use all the throughput),

That's exactly what I wrote...
And dense in the sense of saturating his network would be 4 SSDs, so:

> the issue is with CPU and memory throughput
> (and possibly crappy kernel scheduler depending on how up-to-date distro
> you use).

That's what I wrote as well, which makes smaller nodes with more CPU
resources attractive.

> Also if you want consistent performance even when failure
> occurs, you need to either have 100% reliable SSDs, or put them in RAID
> for the journals. You don't want to rebuild all those HDD OSDs. Losing a
> journal SSD is more likely than losing a HDD these days.
>
Say what?
My "Enterprise" HDDs are failing quite nicely, while I have yet to lose a
single Intel SSD, DC or otherwise.

Christian

> Jan
>
> > On 07 Sep 2015, at 05:53, Christian Balzer <chibi@xxxxxxx> wrote:
> >
> > On Sat, 5 Sep 2015 07:13:29 -0300 German Anders wrote:
> >
> >> Hi Christian,
> >>
> >> Ok, so you would say that it's better to rearrange the nodes so I
> >> don't mix the HDD and SSD disks, right? And create high-perf nodes
> >> with SSDs and others with HDDs; that's fine since it's a new
> >> deployment.
> >>
> > It is what I would do, yes.
> > However, if you're limited to 7 nodes initially, specialized/optimized
> > nodes might result in pretty small "subclusters" and thus relatively
> > large failure domains.
> >
> > If for example this cluster consisted of 2 SSD and 5 HDD nodes,
> > losing 1 of the SSD nodes would roughly halve your read speed from
> > that pool (while amusingly enough improving your write speed ^o^).
> > This is assuming a replication of 2 for SSD pools, which with DC SSDs
> > is a pretty safe choice.
> >
> > Also, dense SSD nodes will be able to saturate your network easily;
> > for example 3-4 of the DC S3xxx SSDs will exceed the bandwidth of your
> > links. This is of course only an issue if you're actually expecting
> > huge amounts of reads/writes, as opposed to lots of small
> > transactions that depend on low latency.
> >
> >> Also the nodes have different types of CPU and RAM: 4 have more CPU
> >> and more memory (384GB) and the other 3 have less CPU and 128GB of
> >> RAM, so maybe I can put the SSDs in the nodes with much more CPU and
> >> leave the HDDs for the other nodes.
> >
> > I take it from this that you already have those machines?
> > Which number and models of CPUs exactly?
> >
> > What you want is as MUCH CPU power for any SSD node as possible, while
> > the HDD nodes will benefit mostly from more RAM (page cache).
> >
> >> The network is going to be InfiniBand FDR at 56Gb/s on all the
> >> nodes, for the public network and for the cluster network.
> >>
> > Is this 1 interface for the public and 1 for the cluster network?
> > Note that with IPoIB (with Accelio not being ready yet) I'm seeing at
> > most 1.5GByte/s with QDR (40Gb/s).
> >
> > If you were to start with a clean slate, I'd go with something like
> > this to achieve the storage capacity you outlined:
> >
> > * 1-2 quad-node chassis like this with 4-6 SSD OSDs per node and a 2nd
> > IB HCA, or a similar product w/o onboard IB and a 2-port IB HCA:
> > http://www.supermicro.com.tw/products/system/2U/2028/SYS-2028TP-HTFR.cfm
> > That will give you 4-8 high-performance SSD nodes in 2-4U.
> >
> > * 5 HDD storage nodes, with 8-10 HDDs and 2-4 journal SSDs, like this:
> > http://www.supermicro.com.tw/products/system/2U/5028/SSG-5028R-E1CR12L.cfm
> > (4 100GB DC S3700s will perform better than 2 200GB ones and give you
> > smaller failure domains at about the same price).
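Purely as a rough sketch of one way to lay out those journals (device
names below are made up, not from this thread): each HDD OSD gets its
journal on a partition of one of the SSDs, e.g. via ceph-disk, with the
journal size set in ceph.conf (10240 MB = 10GB here).

  # in /etc/ceph/ceph.conf on the OSD nodes
  [osd]
  osd journal size = 10240

  # /dev/sdc is one of the HDDs, /dev/sdb one of the journal SSDs;
  # ceph-disk creates a fresh journal partition on /dev/sdb
  ceph-disk prepare /dev/sdc /dev/sdb
  ceph-disk activate /dev/sdc1

With 4 journal SSDs in front of 8-10 HDDs that works out to 2-3 HDD OSDs
per journal SSD, which keeps utilization and wear even.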
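And for the SSD/HDD node split itself, a minimal sketch (bucket, host and
pool names are only placeholders) of keeping the two node types under
separate CRUSH roots and pinning a replica-2 pool to the SSD nodes:

  # separate CRUSH roots for the two node types
  ceph osd crush add-bucket ssd root
  ceph osd crush add-bucket hdd root
  ceph osd crush move ssd-node1 root=ssd    # repeat for each SSD node
  ceph osd crush move hdd-node1 root=hdd    # repeat for each HDD node

  # one rule per root, replicating across hosts
  ceph osd crush rule create-simple ssd-rule ssd host
  ceph osd crush rule create-simple hdd-rule hdd host

  # a size-2 pool that only uses the SSD nodes
  ceph osd pool create rbd-ssd 1024 1024      # PG counts just an example
  ceph osd pool set rbd-ssd crush_ruleset 1   # rule id from 'ceph osd crush rule dump'
  ceph osd pool set rbd-ssd size 2
  ceph osd pool set rbd-ssd min_size 1

min_size 1 keeps the pool writable with one of the two SSD replicas down,
which is the trade-off you accept with a replication of 2.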
> >
> > Christian
> >
> >> Any other suggestion/comment?
> >>
> >> Thanks a lot!
> >>
> >> Best regards
> >>
> >> German
> >>
> >> On Saturday, September 5, 2015, Christian Balzer <chibi@xxxxxxx>
> >> wrote:
> >>
> >>> Hello,
> >>>
> >>> On Fri, 4 Sep 2015 12:30:12 -0300 German Anders wrote:
> >>>
> >>>> Hi cephers,
> >>>>
> >>>> I've the following scheme:
> >>>>
> >>>> 7x OSD servers with:
> >>>>
> >>> Is this a new cluster, a total initial deployment?
> >>>
> >>> What else are these nodes made of, CPU/RAM/network?
> >>> While uniform nodes have some appeal (interchangeability, one node
> >>> down impacts the cluster uniformly) they tend to be compromise
> >>> solutions. I personally would go with optimized HDD and SSD nodes.
> >>>
> >>>> 4x 800GB SSD Intel DC S3510 (OSD-SSD)
> >>> Only 0.3 DWPD, 450TB total in 5 years.
> >>> If you can correctly predict your write volume and it is below that
> >>> per SSD, fine. I'd use 3610s, with internal journals.
> >>>
> >>>> 3x 120GB SSD Intel DC S3500 (Journals)
> >>> In this case even more so, the S3500 is a bad choice. 3x 135MB/s is
> >>> nowhere near your likely network speed of 10Gb/s.
> >>>
> >>> You will get vastly superior performance and endurance with two
> >>> 200GB S3610s (2x 230MB/s) or S3700s (2x 365MB/s).
> >>>
> >>> Why the uneven number of journal SSDs?
> >>> You want uniform utilization and wear. 2 journal SSDs for 6 HDDs
> >>> would be a good ratio.
> >>>
> >>>> 5x 3TB SAS disks (OSD-SAS)
> >>>>
> >>> See above, even numbers make a lot more sense.
> >>>
> >>>> The OSD servers are located on two separate racks with two power
> >>>> circuits each.
> >>>>
> >>>> I would like to know what is the best way to implement this: use
> >>>> the 4x 800GB SSDs as an SSD pool, or use them as a cache pool? Or
> >>>> any other suggestion? Also, any advice for the CRUSH design?
> >>>>
> >>> Nick touched on that already; for right now, SSD pools would
> >>> definitely be better.
> >>>
> >>> Christian
> >>> --
> >>> Christian Balzer        Network/Systems Engineer
> >>> chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
> >>> http://www.gol.com/
> >>
> >
> > --
> > Christian Balzer        Network/Systems Engineer
> > chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
> > http://www.gol.com/
>

--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com