> On 07 Sep 2015, at 12:19, Christian Balzer <chibi@xxxxxxx> wrote:
>
> On Mon, 7 Sep 2015 12:11:27 +0200 Jan Schermer wrote:
>
>> Dense SSD nodes are not really an issue for the network (unless you really
>> use all the throughput),
> That's exactly what I wrote...
> And dense in the sense of saturating his network would be 4 SSDs, so:
>
>> the issue is with CPU and memory throughput
>> (and possibly a crappy kernel scheduler, depending on how up-to-date a
>> distro you use).
> That's what I wrote as well, which makes smaller nodes with more CPU
> resources attractive.
>
>> Also, if you want consistent performance even when a failure
>> occurs, you need to either have 100% reliable SSDs or put them in RAID
>> for the journals. You don't want to rebuild all those HDD OSDs. Losing a
>> journal SSD is more likely than losing an HDD these days.
>>
> Say what?
>
> My "Enterprise" HDDs are failing quite nicely, while I have yet to lose a
> single Intel SSD, DC or otherwise.
>

All I can say is "YMMV".
HDDs are a much more proven technology - they die mechanically and you can
burn them in (they usually either die shortly after being put into production
or from long-term wear). SSDs have many issues (and HBAs have issues with
SSDs), and some of these issues occur either randomly or because of a bug
(like drives failing after exactly 3 months because of some internal timer
overflowing).
Bottom line - HDDs are salvageable even when a firmware bug occurs or a DC
spike fries them (you can swap the electronics). SSDs are dead and you are
SOL.
I think it's prudent to always keep this bottom line in mind and not rely on
a single component...

> Christian
>
>> Jan
>>
>>
>>> On 07 Sep 2015, at 05:53, Christian Balzer <chibi@xxxxxxx> wrote:
>>>
>>> On Sat, 5 Sep 2015 07:13:29 -0300 German Anders wrote:
>>>
>>>> Hi Christian,
>>>>
>>>> OK, so you would say that it's better to rearrange the nodes so I don't
>>>> mix the HDD and SSD disks, right? And create high-performance nodes with
>>>> SSDs and others with HDDs; that's fine since it's a new deployment.
>>>>
>>> It is what I would do, yes.
>>> However, if you're limited to 7 nodes initially, specialized/optimized
>>> nodes might result in pretty small "subclusters" and thus relatively
>>> large failure domains.
>>>
>>> If, for example, this cluster consisted of 2 SSD and 5 HDD nodes,
>>> losing 1 of the SSD nodes would roughly halve your read speed from
>>> that pool (while, amusingly enough, improving your write speed ^o^).
>>> This is assuming a replication of 2 for SSD pools, which with DC SSDs
>>> is a pretty safe choice.
>>>
>>> Also, dense SSD nodes will be able to saturate your network easily; for
>>> example, 3-4 of the DC S3xxx SSDs will exceed the bandwidth of your
>>> links. This is of course only an issue if you're actually expecting
>>> huge amounts of reads/writes, as opposed to having lots of small
>>> transactions that depend on low latency.
>>>
>>>> Also, the nodes have different CPU and RAM configurations: 4 have more
>>>> CPU and more memory (384GB) and the other 3 have less CPU and 128GB of
>>>> RAM, so maybe I can put the SSDs on the nodes with much more CPU and
>>>> leave the HDDs for the other nodes.
>>>
>>> I take it from this that you already have those machines?
>>> Which number and models of CPUs exactly?
>>>
>>> What you want is as MUCH CPU power for any SSD node as possible, while
>>> the HDD nodes will benefit mostly from more RAM (page cache).
>>>
>>>> The network is going to be InfiniBand FDR at 56Gb/s on all the
>>>> nodes, for both the public network and the cluster network.
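
(A rough back-of-the-envelope on the "a few SSDs will saturate a link" point
above - the per-drive and IPoIB numbers here are assumed ballpark figures for
DC S35xx-class drives, not measurements from this cluster:

    per-node SSD read throughput:   4 x ~450-500 MB/s  ~= 1.8-2.0 GB/s
    usable IPoIB throughput:        often ~1.5-2 GB/s in practice, well below
                                    the 56Gb/s FDR line rate

so a single node with 4 fast SSDs can already push more data than one IPoIB
link is likely to deliver.)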
>>>>
>>> Is this 1 interface for the public and 1 for the cluster network?
>>> Note that with IPoIB (with Accelio not being ready yet) I'm seeing at
>>> most 1.5GByte/s with QDR (40Gb/s).
>>>
>>> If you were to start with a clean slate, I'd go with something like
>>> this to achieve the storage capacity you outlined:
>>>
>>> * 1-2 quad-node chassis like this, with 4-6 SSD OSDs per node and a 2nd
>>> IB HCA, or a similar product w/o onboard IB and a 2-port IB HCA:
>>> http://www.supermicro.com.tw/products/system/2U/2028/SYS-2028TP-HTFR.cfm
>>> That will give you 4-8 high-performance SSD nodes in 2-4U.
>>>
>>> * 5 HDD storage nodes, with 8-10 HDDs and 2-4 journal SSDs, like this:
>>> http://www.supermicro.com.tw/products/system/2U/5028/SSG-5028R-E1CR12L.cfm
>>> (4x 100GB DC S3700 will perform better than 2x 200GB ones and give you
>>> smaller failure domains at about the same price).
>>>
>>> Christian
>>>
>>>> Any other suggestion/comment?
>>>>
>>>> Thanks a lot!
>>>>
>>>> Best regards
>>>>
>>>> German
>>>>
>>>>
>>>> On Saturday, September 5, 2015, Christian Balzer <chibi@xxxxxxx>
>>>> wrote:
>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> On Fri, 4 Sep 2015 12:30:12 -0300 German Anders wrote:
>>>>>
>>>>>> Hi cephers,
>>>>>>
>>>>>> I have the following scheme:
>>>>>>
>>>>>> 7x OSD servers with:
>>>>>>
>>>>> Is this a new cluster, a total initial deployment?
>>>>>
>>>>> What else are these nodes made of, CPU/RAM/network?
>>>>> While uniform nodes have some appeal (interchangeability, one node
>>>>> down impacts the cluster uniformly) they tend to be compromise
>>>>> solutions. I personally would go with optimized HDD and SSD nodes.
>>>>>
>>>>>> 4x 800GB SSD Intel DC S3510 (OSD-SSD)
>>>>> Only 0.3 DWPD, 450TB total in 5 years.
>>>>> If you can correctly predict your write volume and it is below that
>>>>> per SSD, fine. I'd use S3610s, with internal journals.
>>>>>
>>>>>> 3x 120GB SSD Intel DC S3500 (Journals)
>>>>> In this case even more so, the S3500 is a bad choice. 3x 135MB/s is
>>>>> nowhere near your likely network speed of 10Gb/s.
>>>>>
>>>>> You will get vastly superior performance and endurance with two 200GB
>>>>> S3610s (2x 230MB/s) or S3700s (2x 365MB/s).
>>>>>
>>>>> Why the uneven number of journal SSDs?
>>>>> You want uniform utilization and wear. 2 journal SSDs for 6 HDDs would
>>>>> be a good ratio.
>>>>>
>>>>>> 5x 3TB SAS disks (OSD-SAS)
>>>>>>
>>>>> See above, even numbers make a lot more sense.
>>>>>
>>>>>>
>>>>>> The OSD servers are located in two separate racks with two power
>>>>>> circuits each.
>>>>>>
>>>>>> I would like to know what is the best way to implement this: use
>>>>>> the 4x 800GB SSDs as an SSD pool, or use them as a cache pool? Or
>>>>>> any other suggestion? Also, any advice for the CRUSH design?
>>>>>>
>>>>> Nick touched on that already; for right now SSD pools would
>>>>> definitely be better.
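
(On the CRUSH design question: a minimal sketch of keeping the SSD and SAS
OSDs under separate CRUSH roots with one pool per root. The host names, PG
counts and rule IDs below are made-up placeholders - check "ceph osd crush
rule dump" for the actual rule IDs before pinning the pools, and note that if
SSDs and HDDs end up sharing a physical server you would need separate
logical "host" buckets for each class:

    # separate roots for the two device classes (bucket/host names are examples)
    ceph osd crush add-bucket ssd root
    ceph osd crush add-bucket sas root
    ceph osd crush move node1-ssd root=ssd
    ceph osd crush move node1-sas root=sas
    # ... repeat for the remaining hosts ...

    # one replicated rule per root, spreading replicas across hosts
    ceph osd crush rule create-simple ssd-rule ssd host firstn
    ceph osd crush rule create-simple sas-rule sas host firstn

    # pools pinned to their rule; size 2 on the SSD pool as discussed above
    ceph osd pool create ssd-pool 1024 1024 replicated
    ceph osd pool set ssd-pool crush_ruleset 1
    ceph osd pool set ssd-pool size 2
    ceph osd pool create sas-pool 2048 2048 replicated
    ceph osd pool set sas-pool crush_ruleset 2
    ceph osd pool set sas-pool size 3

A cache tier could still be layered on later with "ceph osd tier add" if it
turns out to be worth it, but as said above a plain SSD pool is the simpler
and more predictable option right now.)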
>>>>>
>>>>> Christian
>>>>> --
>>>>> Christian Balzer           Network/Systems Engineer
>>>>> chibi@xxxxxxx              Global OnLine Japan/Fusion Communications
>>>>> http://www.gol.com/
>>>>
>>>>
>>>
>>>
>>> --
>>> Christian Balzer           Network/Systems Engineer
>>> chibi@xxxxxxx              Global OnLine Japan/Fusion Communications
>>> http://www.gol.com/
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>>
>
>
> --
> Christian Balzer           Network/Systems Engineer
> chibi@xxxxxxx              Global OnLine Japan/Fusion Communications
> http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com