Re: 6 Node cluster with 24 SSD per node: Hardware planning / agreement

Hi,

thanks for taking a look :-)

On 04.10.2016 16:11, Nick Fisk wrote:

We have two goals:

* High availability
* Short latency for our transaction services

How low? See below re CPUs.

As low as possible without doing crazy stuff. We are thinking of putting the database on Ceph too, instead of on local SSDs in separate servers.


via the API, so a separate metadata server isn't needed, if I understand the documentation correctly.

The metadata server is only for CephFS (the distributed filesystem). For
direct librados library calls or RBD (block devices) you need only mons
and OSDs.

perfect :-)
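
For reference, a minimal sketch of such an MDS-less client using the Python rados/rbd bindings (the conffile path, pool name and image name below are just placeholders):

import rados
import rbd

# Connect using only the mon addresses from ceph.conf -- no MDS involved.
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx('rbd')                        # assumed pool name
    try:
        rbd.RBD().create(ioctx, 'test-image', 4 * 1024**3)   # 4 GiB image
        image = rbd.Image(ioctx, 'test-image')
        image.write(b'hello ceph', 0)    # block I/O goes straight to the OSDs
        image.close()
    finally:
        ioctx.close()
finally:
    cluster.shutdown()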


All nodes are cross-connected to both switches, so if one switch goes down, a second path is available.

Isn't that a 1Gbit switch with a couple of 10G modules? Any reason you
can't get a pure 10G switch?

Right. The reason is that we already have them, with stacking modules and dual power supplies, so we only need the 10Gbit backplane modules. That's it.


* Disk:
** Storage: 24 x Crucial MX300 250GB (maybe for production 12x SSD / 12x big SATA disks)

I would be very careful about using these. They are not enterprise
SSDs. I would go for either the S3610, or the S3510 if you will be doing
mainly reads.

There was a long, long discussion (also here on the list ...). I would also prefer enterprise SSDs, but they are too expensive. Maybe for storage we could use the Samsung 850 Pro series, or whatever fits in the same price range. Personally I would use SSDs with power loss protection, so the Intel S3510 / S37xx would also fit and is on a second buy list.
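
One way to check candidate drives before buying: measure single-threaded 4k O_DSYNC write performance, which is roughly the pattern the OSD journal generates. A minimal sketch (the test path, runtime and the IOPS expectations in the comments are assumptions):

import os
import time

# Quick check of single-threaded 4k O_DSYNC write performance -- roughly the
# write pattern the OSD journal sees. Consumer SSDs without power loss
# protection tend to collapse to a few hundred IOPS here, while DC-class
# drives (S3610/S3700/P3700) typically stay in the thousands.
TEST_FILE = '/mnt/ssd-under-test/journal-test.bin'   # placeholder path
BLOCK = b'\0' * 4096
SECONDS = 10

fd = os.open(TEST_FILE, os.O_WRONLY | os.O_CREAT | os.O_DSYNC, 0o600)
count = 0
start = time.time()
while time.time() - start < SECONDS:
    os.write(fd, BLOCK)
    count += 1
os.close(fd)
os.unlink(TEST_FILE)

elapsed = time.time() - start
print("%.0f sync write IOPS, %.2f ms average latency"
      % (count / elapsed, 1000.0 * elapsed / count))

A drive that stays in the thousands of IOPS here is usually a safer journal candidate than one that collapses to a few hundred.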


** OSD journal: 1 x Intel SSD DC P3700 PCIe

That will not be enough to journal 24x SSDs. Or is this just for the
SATA disks, and the SSDs have no separate journals? In which case it will
be fine.

Hmm, OK, I was nearly sure this question would come up ..... Yes, it would be for all journals. Unless you would say we don't need it, because we put each OSD's journal on the OSD itself ...

We would use the 400GB DC P3700 PCIe edition for the journals.
Otherwise, reading between the lines, we would need two of them to carry the journals of all the SSD drives.
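
A quick back-of-the-envelope check of that, using vendor spec sheet numbers (approximate and purely illustrative):

# Rough journal bandwidth check (vendor spec numbers, treat as approximate):
ssd_write_mb_s   = 510     # Crucial MX300 250GB sequential write (approx.)
num_ssds         = 24
p3700_write_mb_s = 1080    # Intel DC P3700 400GB sequential write (approx.)

aggregate = ssd_write_mb_s * num_ssds
print("Aggregate SSD write capacity: %d MB/s" % aggregate)
print("One P3700 covers about %.0f%% of that"
      % (100.0 * p3700_write_mb_s / aggregate))
# => every client write passes through the journal first, so a single P3700
#    caps the node at roughly 1GB/s of writes; two of them (or journals on
#    the SSDs themselves) relieve that bottleneck.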

really needed in our case. Sure, the cache is one of the benefits, but
maybe it is more complicated than a plain HBA.

Yeah, RAID controllers can sometimes increase performance slightly due
to the write-back cache, but they can also get overwhelmed and
end up being slower. Especially with SSDs you are probably best off with a plain HBA.

great to hear :-)

The OS would be Proxmox 4.x (based on Debian Jessie) with Hammer or
Jewel, but WITHOUT ANY VMs on it. We want to keep the systems in one
hand :-)

Why are you going to run Proxmox with no VMs just for Ceph? What's
wrong with just Ubuntu or Debian?

Proxmox would become our main hypervisor, and Ceph is built-in technology there with all the tooling that is needed. So in the end we have 6 OSD nodes and 4 hypervisors, all under the "umbrella" of Proxmox, and documentation and maintenance are much easier because everything is based on one platform.

So we want to know whether the hardware would also be OK with the mon
servers running on the same hardware as the OSDs. We know that every OSD
should have its own core; the 2620v4 has 8 cores, 16 with HT, so in total
we have 32 logical CPUs per OSD node, which should be fine .... I think ....

I would pay less attention to the number of cores vs. OSDs; instead,
look at the total number of GHz and the number of IOPS you require.
I have been doing some testing recently and have come up with a figure of
around 1MHz per IO. I will be writing up a blog article with
more details in the near future.

If you need a low number of IOs but with low latency, I would go with
a lower number of very fast cores (3.5GHz+). Otherwise,
if you think you will be generating hundreds of thousands of IOs, then you
probably want more cores and will have to accept the increased
latency from the slower cores as a compromise.

That is extremely nice to know!! Most of the documentation talks about core counts, not plain MHz.
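
Putting that rule of thumb against the CPUs proposed above (base clock from the Intel spec, only physical cores counted; whether HT threads should count is an open assumption):

# Rough sizing with the ~1MHz-per-IO rule of thumb above (estimate only):
cores_per_node = 16        # 2x E5-2620 v4, 8 physical cores each
base_clock_mhz = 2100      # E5-2620 v4 base clock

mhz_per_node = cores_per_node * base_clock_mhz
print("~%d MHz per node -> ~%d IOPS per node" % (mhz_per_node, mhz_per_node))
print("6 nodes -> ~%d IOPS cluster-wide (before replication overhead)"
      % (6 * mhz_per_node))
# => roughly 33,600 IOPS per node as a ceiling estimate; for latency-critical
#    pools, fewer but faster (3.5GHz+) cores would help more than extra cores.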


thank you for the comments :-)

cu denny


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


