Hi,
thanks for taking a look :-)
On 04.10.2016 16:11, Nick Fisk wrote:
>> We have two goals:
>> * High availability
>> * Short latency for our transaction services
> How low? See below re CPUs.
As low as possible without doing anything crazy. We are thinking of putting
the database on Ceph too, instead of on local SSDs in separate servers.
>> via API, so a separate metadata server isn't needed, if I understand all
>> the documentation correctly.
> The metadata server is only needed for CephFS (the distributed filesystem).
> For direct librados library calls or RBD (block devices) you only need
> mons and OSDs.
perfect :-)
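
For reference, roughly how we would reach the cluster through the Python
rados/rbd bindings -- only the monitors are contacted; the mon address,
keyring path, and pool name below are just placeholders:

  # Minimal librados/RBD sketch -- no metadata server is involved anywhere.
  import rados
  import rbd

  cluster = rados.Rados(conf={'mon_host': '192.168.0.10',   # placeholder
                              'keyring': '/etc/ceph/ceph.client.admin.keyring'})
  cluster.connect()
  try:
      ioctx = cluster.open_ioctx('rbd')                     # pool name assumed
      rbd.RBD().create(ioctx, 'db-volume', 10 * 1024**3)    # 10 GiB image
      print(rbd.RBD().list(ioctx))
      ioctx.close()
  finally:
      cluster.shutdown()
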
>> All nodes are cross-connected to every switch, so if one switch goes
>> down, a second path is available.
> Isn't that a 1GbE switch with a couple of 10GbE modules? Any reason you
> can't get a pure 10GbE switch?
Right. The reason is that we already have them, with stacking modules and
dual power supplies, so we only need the 10Gbit backplane modules. That's it.
>> * Disk:
>> ** Storage: 24 x Crucial MX300 250GB (maybe for production 12x SSD /
>> 12x big SATA disks)
> I would be very careful about using these. They are not enterprise SSDs.
> I would go for either the S3610, or the S3510 if you will be doing
> mainly reads.
There was a long, long discussion about this (also here on the list ...).
I would also prefer enterprise SSDs, but they are too expensive. Maybe for
storage we could use the Samsung 850 Pro series, or whatever fits in the
same price range. Personally I would use SSDs with power-loss protection,
so the Intel S3510 / S37xx would also fit and is on a second buy list.
>> ** OSD journal: 1 x Intel SSD DC P3700 PCIe
> That will not be enough to journal 24x SSDs. Or is this just for the
> SATA disks, with the SSDs having no separate journals? In which case it
> will be fine.
Hmm, OK, I was fairly sure this question would come up ... yes, it would be
for all journals. Unless you would say we don't need it at all, if we put
each OSD's journal on the OSD itself ...
We would use the 400GB DC P3700 PCIe edition for the journals.
Otherwise I read between the lines that we need two of them to carry the
journals of all the SSD drives.
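
To put rough numbers on it (datasheet figures only, so treat this as a
back-of-envelope sketch, not a measurement from our hardware):

  # Back-of-envelope filestore journal maths -- MB/s values are approximate
  # datasheet numbers, used only to illustrate the bottleneck.
  P3700_400G_WRITE_MBS = 1080   # one DC P3700 400GB, sequential write
  SSD_WRITE_MBS = 500           # one consumer SATA SSD, sustained write
  NUM_SSD_OSDS = 12             # SSD OSDs that would journal on the P3700

  # With filestore, every client write hits the journal first, so the
  # journal device caps the aggregate write throughput of its OSDs.
  aggregate = NUM_SSD_OSDS * SSD_WRITE_MBS
  print(f"SSD OSDs could absorb ~{aggregate} MB/s, "
        f"one P3700 journals ~{P3700_400G_WRITE_MBS} MB/s")
  # -> one P3700 becomes the bottleneck long before 12 (let alone 24) SSDs
  #    do, which is why SSD OSDs usually keep their journal on the SSD itself.
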
>> really needed in our case. Sure, the cache is one of the benefits, but
>> maybe it is more complicated than a plain HBA.
> Yeah, RAID controllers can sometimes increase performance slightly due to
> their write-back cache, but they can also get overwhelmed and end up being
> slower. Especially with SSDs you are probably best off with a plain HBA.
great to hear :-)
>> The OS would be Proxmox 4.x (based on Debian Jessie) with Hammer or
>> Jewel, but WITHOUT ANY VMs on it. We want to keep the systems in one
>> hand :-)
> Why are you going to run Proxmox with no VMs just for Ceph? What's wrong
> with just Ubuntu or Debian?
Proxmox will become our main hypervisor, and Ceph is built-in technology
there, with all the tooling that is needed. So in the end we have 6 OSD
nodes and 4 hypervisors, all under the Proxmox "umbrella".
That way documentation and maintenance are much easier, since everything is
based on one platform.
>> So we want to know whether the hardware is also OK for running the mon
>> servers on the same HW as the OSDs. We know that every OSD should own a
>> core; the 2620v4 has 8 cores, 16 with HT, and in total we have 32 logical
>> CPUs per OSD node, which should be fine ... I think ...
> I would pay less attention to the number of cores per OSD and instead look
> at the total number of GHz and the number of IOPS you require.
> I have been doing some testing recently and have come up with a figure of
> around 1 MHz per IO. I will be writing up a blog article with more details
> in the near future.
> If you need a low number of IOs but with low latency, I would go with a
> smaller number of very fast cores (3.5GHz+). Otherwise, if you think you
> will be generating hundreds of thousands of IOs, you probably want more
> cores and will have to accept the increased latency of the slower cores as
> a compromise.
That is extremely nice to know!! Most of the documentation talks about core
counts, not plain MHz.
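
Taking your ~1 MHz-per-IO figure, a quick back-of-envelope for the 2620v4
(my own arithmetic, so just a rough sketch):

  # Rough per-socket IOPS ceiling from the ~1 MHz-per-IO rule of thumb.
  CORES = 8            # Xeon E5-2620 v4, one socket
  BASE_GHZ = 2.1       # base clock, ignoring turbo and HT
  MHZ_PER_IO = 1.0     # Nick's measured ballpark

  total_mhz = CORES * BASE_GHZ * 1000
  print(f"~{total_mhz / MHZ_PER_IO:,.0f} IOPS per socket as a ceiling")
  # -> roughly 16,800 IOPS per E5-2620 v4 before the CPU saturates;
  #    fewer, faster cores trade total IOPS for lower per-IO latency.
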
thank you for the comments :-)
cu denny