Re: Is ceph itself a single point of failure?

Marius Leustean <marius.leus@xxxxxxxxx> · Mon, 22 Nov 2021 12:39:49 +0200

> I do not know what you mean by this, you can tune this with your min size
> and replication. It is hard to believe that exactly harddrives fail in the
> same pg. I wonder if this is not more related to your 'non-default' config?

My pools use size=2 and min_size=1. I have had cases where a single PG
stuck in the peering state left every VM in that pool without any I/O. My
setup really is "default": deployed from ceph-ansible with minimal config
changes, and with an even number of OSDs per host.
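
A rough sketch of why one stuck PG can stall a whole pool of VMs. It assumes
RBD's default 4 MiB object size, fully written images, and objects spread
uniformly and independently over the pool's PGs. The pool size of 1024 PGs and
the image sizes are hypothetical, not taken from the cluster described here.

```python
def p_image_touches_stuck_pg(image_bytes, pg_count, stuck_pgs=1,
                             object_bytes=4 * 1024**2):
    """Probability that at least one object of an image maps to a stuck PG.

    Assumption: each object lands on a PG uniformly at random, independently
    of the others, and every object of the image has been written.
    """
    objects = -(-image_bytes // object_bytes)  # ceiling division
    return 1.0 - (1.0 - stuck_pgs / pg_count) ** objects


if __name__ == "__main__":
    GiB = 1024**3
    # Hypothetical pool with 1024 PGs, exactly one of them stuck.
    for size_gib in (1, 10, 100):
        p = p_image_touches_stuck_pg(size_gib * GiB, pg_count=1024)
        print(f"{size_gib:>4} GiB image: P(blocked on the stuck PG) = {p:.3f}")
```

Under these assumptions, any reasonably large image almost certainly has an
object in the stuck PG, and a VM blocks as soon as it touches that object.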

> That is also very hard to believe, since I am updating ceph and reboot
> one node at time, which is just going fine.

Real case: a host went down, individual OSDs on other hosts started
consuming >100GB of RAM during backfill and got OOM-killed (but hey, the
documentation says that "provisioning ~8GB per BlueStore OSD is advised.")
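
To put numbers on that gap, here is a back-of-the-envelope sketch. The ~8GB
per OSD and the 100GB spike come from the paragraph above; the host RAM, OSD
count, and OS reserve are hypothetical values chosen only for illustration.

```python
def host_memory_headroom_gib(host_ram_gib, osds_per_host,
                             per_osd_gib=8.0, os_reserve_gib=8.0):
    """RAM left over once every OSD has its advised budget (hypothetical reserve)."""
    return host_ram_gib - os_reserve_gib - osds_per_host * per_osd_gib


def osds_that_can_spike(host_ram_gib, osds_per_host, spike_gib=100.0,
                        per_osd_gib=8.0, os_reserve_gib=8.0):
    """How many OSDs could grow to spike_gib while the rest stay at per_osd_gib."""
    headroom = host_memory_headroom_gib(host_ram_gib, osds_per_host,
                                        per_osd_gib, os_reserve_gib)
    if headroom < 0:
        return 0
    extra = spike_gib - per_osd_gib  # additional RAM one spiking OSD needs
    if extra <= 0:
        return osds_per_host
    return min(osds_per_host, int(headroom // extra))


if __name__ == "__main__":
    # Hypothetical hosts with 12 OSDs each.
    for ram in (128, 256):
        n = osds_that_can_spike(ram, osds_per_host=12)
        print(f"{ram} GiB host, 12 OSDs: {n} OSD(s) can reach 100 GiB before OOM")
```

A host sized by the 8GB guideline has little or no room for even one OSD
behaving this way, which matches the OOM kills described above.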

> If you would read and investigate, you would not need to ask this
> question.

I was hoping to get insights into other people's environments, hence the
questions :)

> Is your lack of knowledge of ceph maybe a critical issue?

I'm just that poor guy who reads and understands the official documentation
and lists, but still gets hit by real-world ceph.

On Mon, Nov 22, 2021 at 12:23 PM Marc <Marc@xxxxxxxxxxxxxxxxx> wrote:

> >
> > Many of us deploy ceph as a solution to storage high-availability.
> >
> > During the time, I've encountered a couple of moments when ceph refused
> > to deliver I/O to VMs even when a tiny part of the PGs were stuck in
> > non-active states due to challenges on the OSDs.
>
> I do not know what you mean by this, you can tune this with your min size
> and replication. It is hard to believe that exactly harddrives fail in the
> same pg. I wonder if this is not more related to your 'non-default' config?
>
> > So I found myself in very unpleasant situations when an entire cluster
> > went down because of 1 single node, even if that cluster was supposed to
> > be fault-tolerant.
>
> That is also very hard to believe, since I am updating ceph and reboot one
> node at time, which is just going fine.
>
> >
> > Regardless of the reason, the cluster itself can be a single point of
> > failure, even if it has a lot of nodes.
>
> Indeed, like the data center, and like the planet. The question you should
> ask yourself, do you have a better alternative? For the 3-4 years I have
> been using ceph, I did not find a better alternative (also not looking for
> it ;))
>
> > How do you segment your deployments so that your business doesn't
> > get jeopardised in the case when your ceph cluster misbehaves?
> >
> > Does anyone even use ceph for a very large clusters, or do you prefer to
> > separate everything into smaller clusters?
>
> If you would read and investigate, you would not need to ask this
> question.
> Is your lack of knowledge of ceph maybe a critical issue? I know the ceph
> organization likes to make everything as simple as possible for everyone.
> But this has of course its flip side when users run into serious issues.
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx