Re: 800TB - Ceph Physical Architecture Proposal

On Fri, 8 Apr 2016 07:39:18 +0000 Maxime Guyot wrote:

> Hello,
> 
> On 08/04/16 04:47, "ceph-users on behalf of Christian Balzer"
> <ceph-users-bounces@xxxxxxxxxxxxxx on behalf of chibi@xxxxxxx> wrote:
> 

> >
> >> 11 OSD nodes:
> >> -SuperMicro 6047R-E1R36L
> >> --2x E5-2603v2
> >Vastly underpowered for 36 OSDs.
> >> --128GB RAM
> >> --36x 6TB OSD
> >> --2x Intel P3700 (journals)
> >Which exact model?
> >If it's the 400GB one, that's 2GB/s maximum write speed combined.
> >Slightly below what I'd expect your 36 HDDs to be able to write at about
> >2.5GB/s (36*70MB/s), but not unreasonably so.
> >However your initial network thoughts are massively overspec'ed for this
> >kind of performance.
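
As a back-of-the-envelope check of the above (a rough Python sketch; the
~1GB/s per 400GB P3700 and ~70MB/s per HDD figures are the same
assumptions used in the estimate):

  # Rough per-node write ceiling: journals vs. spinners.
  P3700_400GB_WRITE_MBS = 1000   # assumed ~1GB/s sequential write each
  HDD_WRITE_MBS = 70             # assumed streaming write per 6TB HDD
  journal_ceiling = 2 * P3700_400GB_WRITE_MBS    # 2000 MB/s combined
  hdd_ceiling = 36 * HDD_WRITE_MBS               # 2520 MB/s combined
  node_ceiling = min(journal_ceiling, hdd_ceiling)
  print("journals: %d MB/s, HDDs: %d MB/s, node tops out around %d MB/s"
        % (journal_ceiling, hdd_ceiling, node_ceiling))
  # i.e. roughly 2GB/s of writes per node is all the network would
  # ever need to carry.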
> 
> What I have seen about OSD server sizing is:
> - 1GB of RAM per TB of OSD, 36x6TB for replicated pools
That's the standard mantra, a "safe" recommendation. 
I haven't managed to get an OSD process over 2.5GB, and that was with
an OSD that had nearly 600 PGs on it and 5TB of data during a
restart/recovery.
Several devs have in the past also commented on this. 
128GB is sufficient for his case; more RAM is of course better if you
can afford it.
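
For illustration, the mantra versus the observed worst case as a rough
Python sketch (36x6TB per node is from the proposal, ~2.5GB per OSD is
the recovery peak mentioned above):

  # RAM sizing: "1GB per TB" rule of thumb vs. observed worst case.
  OSDS_PER_NODE = 36
  OSD_SIZE_TB = 6
  mantra_gb = OSDS_PER_NODE * OSD_SIZE_TB        # 216 GB per node
  observed_peak_gb = OSDS_PER_NODE * 2.5         # ~90 GB per node
  print("mantra: %dGB, observed recovery peak: ~%dGB, proposed: 128GB"
        % (mantra_gb, observed_peak_gb))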

> - 0.5 core or 1GHz per OSD disk for replicated pools
That's for journals on HDD; my personal goalpost for OSDs with SSD
journals is about double that, especially if I'm expecting lots of
small writes, i.e. an IOPS-bound cluster.
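
Roughly, as a Python sketch (the ~2GHz-per-OSD goalpost is the one
above; 4 cores at 1.8GHz per E5-2603v2 is the published spec for that
part, if I have it right):

  # CPU sizing: GHz budget per node vs. ~2GHz per OSD with SSD journals.
  OSDS_PER_NODE = 36
  ghz_per_osd = 2.0                    # ~2x the 1GHz HDD-journal rule
  needed_ghz = OSDS_PER_NODE * ghz_per_osd         # 72 GHz
  proposed_ghz = 2 * 4 * 1.8           # 2x E5-2603v2, 4 cores @ 1.8GHz
  print("wanted ~%.0f GHz, proposed node has ~%.1f GHz"
        % (needed_ghz, proposed_ghz))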

Christian
> - 1 or 2 cores for SSDs
> 
> Source:
> - Minimum hardware recommendations:
> http://docs.ceph.com/docs/hammer/start/hardware-recommendations/#minimum-hardware-recommendations
> - Video (timestamp 12:00): https://www.youtube.com/watch?v=XBfYY-VhzpY
> - Slides (slide 20):
> http://www.slideshare.net/mirantis/ceph-talk-vancouver-20
> 
> So you might want to increase the RAM to around 192-256GB and the CPU to
> something like dual 10-core 2GHz (or more), an E5-2660 v2 for example.
> 
> 
> 
> >
> >> 
> >> 3 MDS nodes:
> >> -SuperMicro 1028TP-DTR (one node from scale-out chassis)
> >> --2x E5-2630v4
> >> --128GB RAM
> >> --2x 120GB SSD (RAID 1 for OS)
> >I'm not using CephFS, but if the MDS are like all the other Ceph bits
> >(MONs in particular), they are likely to write happily to leveldb or
> >the like; do verify that.
> >If that's the case, fast and durable SSDs will be needed.
> >
> >> 
> >> 5 MON nodes:
> >> -SuperMicro 1028TP-DTR (one node from scale-out chassis)
> >> --2x E5-2630v4
> >> --128GB RAM
> >> --2x 120GB SSD (RAID 1 for OS)
> >> 
> >Total overkill, are you sure you didn't mix up the CPUs for the OSDs
> >with the ones for the MONs?
> >Also, while dedicated MONs are nice, they really can live rather
> >frugally, except for the lust for fast, durable storage.
> >If I were you, I'd get 2 dedicated MON nodes (with few, fastish cores)
> >and 32-64GB RAM, then put the other 3 on your MDS nodes, which seem to
> >have plenty of resources to go around.
> >You will want the dedicated MONs to have the lowest IPs in your network,
> >as the monitor leader is chosen by that.
> >
> >Christian
> >> We'd use our existing Zabbix deployment for monitoring and ELK for log
> >> aggregation.
> >> 
> >> Provisioning would be through puppet-razor (PXE) and puppet.
> >> 
> >> Again, thank you for any information you can provide
> >> 
> >> --Brady
> >
> >
> >-- 
> >Christian Balzer        Network/Systems Engineer                
> >chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
> >http://www.gol.com/
> 
> 
> Regards,
> Maxime G


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


