Re: 6 Node cluster with 24 SSD per node: Hardware planning / agreement

> On 10 October 2016 at 14:56, Matteo Dacrema <mdacrema@xxxxxxxx> wrote:
> 
> 
> Hi,
> 
> I’m planning a similar cluster.
> Because it’s a new project I’ll start with only a 2-node cluster, each node with:
> 

2 nodes in a Ceph cluster is way too small in my opinion.

I suggest you go with a lot more, smaller nodes, say 4 SSDs each, instead of two big machines with 24 SSDs. With just 2 nodes you're asking for trouble.

Wido
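
To put rough numbers on that suggestion, here is a minimal sizing sketch in Python; the node counts, disk sizes and replica count are illustrative assumptions only. With 2 hosts, the default 3 replicas on a host-level failure domain cannot even be placed, and a single node failure takes half the OSDs with it:

    # Rough sizing sketch: few big nodes vs. many small nodes.
    # All inputs are illustrative assumptions, not measurements.

    def cluster_overview(nodes, ssds_per_node, ssd_tb, replicas=3):
        raw_tb = nodes * ssds_per_node * ssd_tb
        usable_tb = raw_tb / replicas
        # Share of the cluster that must be re-replicated if one host dies.
        node_failure_impact_pct = 100.0 / nodes
        # With a host-level failure domain you need at least `replicas` hosts,
        # plus one spare so recovery has somewhere to put the missing copies.
        return {
            "raw_tb": round(raw_tb, 1),
            "usable_tb": round(usable_tb, 1),
            "node_failure_impact_pct": round(node_failure_impact_pct, 1),
            "enough_hosts_to_recover": nodes >= replicas + 1,
        }

    # Two big nodes vs. twelve small nodes with the same raw capacity.
    print(cluster_overview(nodes=2,  ssds_per_node=24, ssd_tb=1.92))
    print(cluster_overview(nodes=12, ssds_per_node=4,  ssd_tb=1.92))
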

> 2x E5-2640v4 with 40 threads total @ 3.40GHz with turbo
> 24x 1.92 TB Samsung SM863 
> 128GB RAM
> 3x LSI 3008 in IT mode / HBA for OSDs - one HBA per 8 OSDs/SSDs
> 2x SSD for OS
> 2x 40Gbit/s NIC
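
As a rough sanity check of the spec quoted above, here is a sketch that compares the three per-node ceilings; the per-device throughput figures are assumptions, not datasheet facts:

    # Back-of-the-envelope throughput check for the node spec quoted above.
    # Per-device numbers are assumptions for illustration only.

    ssds_per_node  = 24
    ssd_write_mb_s = 500        # assumed sustained write per SM863
    nic_gbit_s     = 2 * 40     # 2x 40Gbit/s NIC
    hba_count      = 3          # LSI 3008, one per 8 SSDs
    hba_gb_s       = 7.8        # rough usable PCIe 3.0 x8 bandwidth per HBA

    disk_gb_s = ssds_per_node * ssd_write_mb_s / 1000
    net_gb_s  = nic_gbit_s / 8  # Gbit/s -> GB/s, ignoring protocol overhead
    hba_total_gb_s = hba_count * hba_gb_s

    print(f"aggregate SSD writes : {disk_gb_s:.1f} GB/s")
    print(f"network (both NICs)  : {net_gb_s:.1f} GB/s")
    print(f"HBA ceiling          : {hba_total_gb_s:.1f} GB/s")
    # The smallest of the three is the realistic per-node ceiling; replication
    # traffic between OSD nodes competes for the same NIC budget.
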
> 
> 
> What about this hardware configuration? Is that wrong, or am I missing something?
> 
> Regards
> Matteo
> 
> > On 06 Oct 2016, at 13:52, Denny Fuchs <linuxmail@xxxxxxxx> wrote:
> > 
> > Good morning,
> > 
> >>> * 2 x SN2100 100Gb/s Switch 16 ports
> >> Which incidentally is a half sized (identical HW really) Arctica 3200C.
> >  
> > really never heard of them :-) (and didn't find any price in the €/$ region)
> >  
> > 
> >>> * 10 x ConnectX 4LX-EN 25Gb card for hypervisor and OSD nodes
> > [...]
> > 
> >> You haven't commented on my rather lengthy mail about your whole design,
> >> so to reiterate:
> >  
> > maybe I accidentally skipped it, so much new input :-) sorry
> > 
> >> The above will give you a beautiful, fast (but I doubt you'll need the
> >> bandwidth for your DB transactions), low latency and redundant network
> >> (these switches do/should support MC-LAG). 
> >  
> > Yep, they do MLAG (with the 25Gbit version of the CX4 NICs)
> >  
> >> In more technical terms, your network as depicted above can handle under
> >> normal circumstances around 5GB/s, while your OSD nodes can't write more
> >> than 1GB/s.
> >> Massive, wasteful overkill.
> >  
> > Before we started planning Ceph / the new hypervisor design, we were sure that our network would be more powerful than we'd need in the near future. Our applications / DB never used the full 1Gb/s in any way ... we lose speed on the plain (painful LANCOM) switches and in the applications (mostly Perl, written back around 2005).
> > But anyway, the network should have enough capacity for the next years, because it is much more complicated to change network (design) components than to kick a node.
> >  
> >> With a 2nd NVMe in there you'd be at 2GB/s, or simple overkill.
> >  
> > We would buy them ... so that in the end, every 12 disks have a separate NVMe
> > 
> >> With decent SSDs and in-line journals (400GB DC S3610s) you'd be at 4.8
> >> GB/s, a perfect match.
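
For reference, the 1 / 2 / 4.8 GB/s figures quoted above fall out of simple journal arithmetic: with filestore, every write lands in the journal first, so the journal devices cap the node's sustained write throughput. A minimal sketch with assumed per-device write speeds (roughly 1 GB/s for an NVMe journal, 400 MB/s for a 400GB DC S3610):

    # Journal-limited write ceiling per node (filestore-era Ceph).
    # Device write speeds below are assumptions used for illustration.

    def node_write_ceiling_gb_s(journal_devices, journal_write_mb_s):
        return journal_devices * journal_write_mb_s / 1000

    print(node_write_ceiling_gb_s(1, 1000))    # one NVMe journal          -> ~1.0 GB/s
    print(node_write_ceiling_gb_s(2, 1000))    # two NVMe journals         -> ~2.0 GB/s
    print(node_write_ceiling_gb_s(12, 400))    # 12 SSDs, in-line journals -> ~4.8 GB/s
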
> >  
> > What about the worst case, where two nodes are broken, fixed and replaced? I read (a lot) that some Ceph users had massive problems while the rebuild was running.
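
On the rebuild question: the pain usually comes from recovery traffic competing with client I/O, which is why Ceph throttles recovery (osd_max_backfills, osd_recovery_max_active). A hedged sketch of how long re-replicating a lost node could take, with purely assumed numbers:

    # Very rough rebuild-time estimate; every input here is an assumption.
    # Real recovery speed depends on throttling and on live client load.

    def rebuild_hours(lost_data_tb, recovery_gb_s):
        return lost_data_tb * 1000 / recovery_gb_s / 3600

    # Example: a node holding 20 TB of data, recovery limited to ~1 GB/s
    # cluster-wide so that client traffic still gets through.
    print(f"{rebuild_hours(lost_data_tb=20, recovery_gb_s=1.0):.1f} hours")
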
> >  
> > 
> >> Of course if your I/O bandwidth needs are actually below 1GB/s at all times
> >> and all your care about is reducing latency, a single NVMe journal will be
> >> fine (but also be a very obvious SPoF).
> > 
> > Very happy to put the finger in the wound; a SPoF ... is a very hard thing ... so we try to plan everything redundantly :-)
> >  
> > The bad side of life: the SSD itself. A consumer SSD costs around 70-80€, a DC SSD jumps up to 120-170€. My nightmare is a lot of SSDs dying at the same time ... -> arghh
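
On the consumer-vs-DC SSD price gap: the premium mostly buys write endurance, which matters because Ceph's replication and journaling multiply the writes each device sees. A small sketch with placeholder endurance ratings (look up the real TBW figures for the exact models):

    # Endurance comparison, consumer vs. DC SSD; ratings are placeholders.

    def years_of_life(tbw_rating_tb, host_writes_tb_per_day, write_amplification=2.0):
        # write_amplification stands in for journal + replication overhead.
        return tbw_rating_tb / (host_writes_tb_per_day * write_amplification) / 365

    consumer_tbw = 150    # assumed rating for a typical consumer SSD
    dc_tbw       = 3500   # assumed rating for a typical DC/enterprise SSD

    print(f"consumer: {years_of_life(consumer_tbw, 0.5):.1f} years")
    print(f"DC      : {years_of_life(dc_tbw, 0.5):.1f} years")
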
> >  
> > But, we are working on it :-)
> >  
> > I've been searching for an alternative to the Asus board with more PCIe slots and maybe different components: a better CPU at 3.5GHz or above; maybe a mix of SSDs ...
> >  
> > At this time, I've found the X10DRi:
> >  
> > https://www.supermicro.com/products/motherboard/xeon/c600/x10dri.cfm
> >  
> > and I think we'll use the E5-2637v4 :-)
> >  
> >  cu denny
> >  
> > 
> 
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



