On 26.03.2015 at 16:36, Mark Nelson wrote:
I suspect a config like this where you only have 3 OSDs per node would
be more manageable than something denser.
I.e., theoretically a single E5-2697v3 is enough to run 36 OSDs in a 4U
Supermicro chassis for a semi-dense converged solution. You could
attempt to restrict the OSDs to one socket and then use a second
E5-2697v3 for VMs. Maybe after you've got cgroups set up properly and if
you've otherwise balanced things it would work out OK. I question
though how much you really benefit by doing this rather than running a
36 drive storage server with lower bin CPUs and a 2nd 1U box for VMs
(which you don't need as many of because you can dedicate both sockets
to VMs).
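
A minimal sketch of what "restrict the OSDs to one socket" could look like on Linux, assuming that one NUMA node corresponds to one socket and that the node's CPUs are listed in sysfs. The helper names are hypothetical and nothing here comes from Ceph itself. Note that os.sched_setaffinity only changes the thread identified by the given id (threads created afterwards inherit the mask), and it does not touch memory placement, so in practice the daemon would more likely be started under numactl or inside a cpuset cgroup.

```python
import os


def parse_cpulist(text):
    """Parse a Linux cpulist string such as "0-13,28-41" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def cpus_of_numa_node(node):
    """Read the CPUs that belong to one NUMA node from sysfs (Linux only)."""
    path = f"/sys/devices/system/node/node{node}/cpulist"
    with open(path) as handle:
        return parse_cpulist(handle.read())


def pin_to_node(pid, node):
    """Restrict the thread with id `pid` to the CPUs of one NUMA node.

    Assumption: node 0 is the "Ceph socket". Threads that already exist
    in the process keep their old affinity; only new ones inherit this.
    """
    os.sched_setaffinity(pid, cpus_of_numa_node(node))
```

Calling pin_to_node(osd_pid, 0), with osd_pid being the id of a ceph-osd process, would keep that thread on node 0 and leave the other socket free for VMs.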
That's pretty big. I have only around 6-8 SSD drives per node. With
36 OSDs per node I wouldn't mix.
It probably depends quite a bit on how memory, network, and disk
intensive the VMs are, but my take is that it's better to err on the
side of simplicity rather than making things overly complicated. Every
second you are screwing around trying to make the setup work right eats
into any savings you might gain by going with the converged setup.
Mark
On 03/26/2015 10:12 AM, Quentin Hartman wrote:
I run a converged openstack / ceph cluster with 14 1U nodes. Each has 1
SSD (os / journals), 3 1TB spinners (1 OSD each), 16 HT cores, 10Gb NICs
for ceph network, and 72GB of RAM. I configure openstack to leave 3GB of
RAM unused on each node for OSD / OS overhead. All the VMs are backed by
ceph volumes and things generally work very well. I would prefer a
dedicated storage layer simply because it seems more "right", but I
can't say that any of the common concerns of using this kind of setup
have come up for me. Aside from shaving off that 3GB of RAM, my
deployment isn't any more complex than a split stack deployment would
be. After running like this for the better part of a year, I would have
a hard time honestly making a real business case for the extra hardware
a split stack cluster would require.
QH
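
The memory arithmetic behind this setup can be sketched as follows. The 72GB and 3GB figures are from the message above; the function name is hypothetical. The message does not say which OpenStack setting was used to hold back the 3GB. Nova's reserved_host_memory_mb option, which takes a value in MB, is one likely candidate, but that is an assumption.

```python
def guest_memory_mb(total_ram_gb, reserved_gb):
    """Memory left for VMs after holding back a fixed reserve for OSDs and the OS."""
    if reserved_gb > total_ram_gb:
        raise ValueError("reserve exceeds installed RAM")
    return (total_ram_gb - reserved_gb) * 1024


# Figures from the message: 72 GB per node, 3 GB held back.
reserve_mb = 3 * 1024               # 3072, the reserve expressed in MB
vm_memory_mb = guest_memory_mb(72, 3)
print(reserve_mb, vm_memory_mb)     # 3072 70656
```

With 3 OSDs per node, the 3GB reserve has to cover all three OSD daemons plus the OS, which is tight during recovery and ties in with the memory-pressure concern raised elsewhere in the thread.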
On Thu, Mar 26, 2015 at 6:57 AM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
It's kind of a philosophical question. Technically there's nothing
that prevents you from putting ceph and the hypervisor on the same
boxes. It's a question of whether or not potential cost savings are
worth increased risk of failure and contention. You can minimize
those things through various means (cgroups, restricting NUMA nodes,
etc). What is more difficult is isolating disk IO contention (say
if you want local SSDs for VMs), memory bus and QPI contention,
network contention, etc. If the VMs are working really hard you can
restrict them to their own socket, and you can even restrict memory
usage to the local socket, but what about remote socket network or
disk IO? (you will almost certainly want these things on the ceph
socket) I wonder as well about increased risk of hardware failure
with the increased load, but I don't have any statistics.
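
As an illustration of the cgroups approach mentioned above, here is a minimal sketch that computes memory and CPU caps for a group holding the OSD processes. It assumes the cgroup v2 interface (control files memory.max and cpu.max); at the time of this thread cgroup v1 was the norm, where the corresponding files are memory.limit_in_bytes and cpu.cfs_quota_us / cpu.cfs_period_us. The function names, the group directory, and the 4 GiB / 2 CPU figures are hypothetical.

```python
from pathlib import Path

GIB = 1024 ** 3


def cgroup_v2_limits(memory_gib, cpus, period_us=100_000):
    """Build the values for a cgroup v2 group's memory.max and cpu.max files.

    cpu.max takes "<quota> <period>" in microseconds, so a quota of twice
    the period allows the group to use up to two CPUs' worth of time.
    """
    quota_us = int(cpus * period_us)
    return {
        "memory.max": str(int(memory_gib * GIB)),
        "cpu.max": f"{quota_us} {period_us}",
    }


def apply_limits(cgroup_dir, limits):
    """Write each limit into the group's control file (needs root and cgroup v2)."""
    for name, value in limits.items():
        Path(cgroup_dir, name).write_text(value)


# Hypothetical example: cap the OSD group at 4 GiB of RAM and 2 CPUs.
osd_limits = cgroup_v2_limits(memory_gib=4, cpus=2)
```

Something like apply_limits("/sys/fs/cgroup/ceph-osd", osd_limits) would then apply the caps, assuming such a group exists. As the message points out, this bounds CPU and memory but does nothing for disk IO, memory bus, QPI, or network contention.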
I'm guessing if you spent enough time at it you could make it work
relatively well, but at least personally I question how beneficial
it really is after all of that. If you are going for cost savings,
I suspect efficient compute and storage node designs will be nearly
as good with much less complexity.
Mark
On 03/26/2015 07:11 AM, Wido den Hollander wrote:
On 26-03-15 12:04, Stefan Priebe - Profihost AG wrote:
Hi Wido,
On 26.03.2015 at 11:59, Wido den Hollander wrote:
On 26-03-15 11:52, Stefan Priebe - Profihost AG wrote:
Hi,
In the past I read pretty often that it's not a good idea to run ceph
and qemu / the hypervisors on the same nodes.
But why is this a bad idea? You save space and can better use the
resources you have in the nodes anyway.
Memory pressure during recovery *might* become a problem. If you make
sure that you don't allocate more than, let's say, 50% for the guests it
could work.
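
The 50% rule of thumb can be checked with a few lines. This is only a sketch: the function names are hypothetical, the fraction is the "let's say 50%" from the message rather than a Ceph recommendation, and the host and guest sizes in the example are made up.

```python
def guest_memory_budget_gb(host_ram_gb, guest_fraction=0.5):
    """RAM that may be handed to guests, leaving the rest for OSDs and recovery."""
    return host_ram_gb * guest_fraction


def is_overprovisioned(host_ram_gb, guest_allocations_gb, guest_fraction=0.5):
    """True if the guests together are allocated more than the budget allows."""
    budget = guest_memory_budget_gb(host_ram_gb, guest_fraction)
    return sum(guest_allocations_gb) > budget


# Hypothetical 64 GB converged node:
print(is_overprovisioned(64, [16, 8, 8]))    # False, 32 GB is exactly the budget
print(is_overprovisioned(64, [16, 16, 8]))   # True, 40 GB exceeds the 32 GB budget
```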
Hmm, are you sure? I've never seen problems like that. Currently I run
each ceph node with 64GB of memory and each hypervisor node with around
512GB to 1TB RAM while having 48 cores.
Yes, it can happen. You have machines with enough memory, but if you
overprovision the machines it can happen.
Using cgroups you could also prevent the OSDs from eating up all memory
or CPU.
I've never seen an OSD do such crazy things.
Again, it really depends on the available memory and CPU. If you buy big
machines for this purpose it probably won't be a problem.
Stefan
So technically it could work, but memory and CPU pressure is something
which might give you problems.
Stefan
_________________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com