Hi,

I figured I should follow up on this discussion, not with the intention of bashing any particular solution, but to point to at least one current major challenge with cephadm. As I wrote earlier in the thread, we previously found it ... challenging to debug things running in cephadm.

Earlier this week it appears we too were hit by the bug where cephadm removes monitors from the monmap ( https://tracker.ceph.com/issues/51027 ) if the node is rebooted. Presently our cluster is offline, because there's still no fix, and every single piece of documentation for things like monmaptool appears to assume it's running natively, not through cephadm. There's also the additional fragility that all the "ceph orch" commands themselves stop working (even a simple status request just hangs) if the ceph cluster itself is down. I suspect we'll find ways around that, but on reflection I have a few thoughts:

1. It is significantly harder than one thinks to develop a stable orchestration environment. We've been happy with both Salt & Ansible, but on balance cephadm appears quite fragile - and I'm not sure it will ever be realistic to invest the amount of work required to make it as stable. There are of course many advantages to having something closely tied to the specific solution (Ceph) - but in hindsight that seems to only have been an advantage in sunny weather. Once the service itself is down, I think it is a clear & major drawback that suddenly your orchestrator also stops responding. Long-term, if cephadm is the solution, I think it's important that it works even when the Ceph services themselves are down.

2. I think Ceph - in particular the documentation - suffers from too many different ways of doing things (raw packages, or Rook, or cephadm, which in turn can use either Docker or Podman, etc.), which again is a pain the second you need to debug or fix anything. If the decision is that cephadm is the way things should work, so be it, but then all documentation has to actually reflect how to do things in a cephadm environment (and not e.g. assume all the containers are running so you can log in to the right container first). How do you extract a monmap in a cephadm cluster, for instance? Just following the default documentation produces errors. (One possible approach is sketched at the very end of this mail, below the quoted thread.) Presently I feel the short-term solution has been to allow multiple different ways of doing things. As a developer I can understand that, but as a user it's a nightmare unless somebody takes the time to properly update all documentation with two (or more) choices describing how to do things (a) natively, or (b) in a cephadm cluster.

Again, this is meant as hopefully constructive feedback rather than complaints. But the feeling I get - after fairly smooth operations with raw packages (including fixing previous bugs leading to severe crashes) and lately grinding our teeth a bit over cephadm - is that it has helped automate a bunch of stuff that wasn't particularly difficult (it's nice to issue an update with a single command, but it works perfectly fine manually too) at the cost of making it WAY more difficult to fix things (not to mention simply get information about the cluster) when we have problems - and in the long run that's not a trade-off I'm entirely happy with :-)

Cheers,

Erik

On Tue, Jun 29, 2021 at 1:25 AM Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Fri, Jun 25, 2021 at 10:27 AM Nico Schottelius
> <nico.schottelius@xxxxxxxxxxx> wrote:
> >
> > Hey Sage,
> >
> > Sage Weil <sage@xxxxxxxxxxxx> writes:
> > > Thank you for bringing this up.
> > > This is in fact a key reason why the
> > > orchestration abstraction works the way it does--to allow other
> > > runtime environments to be supported (FreeBSD!
> > > sysvinit/Devuan/whatever for systemd haters!)
> >
> > I would like you to stop labeling people who have reasons for not using
> > a specific software as haters.
> >
> > It is not productive to call Ceph developers "GlusterFS haters", nor to
> > call Redhat users Debian haters.
> >
> > It is simple not an accurate representation.
>
> You're right, and I apologize. My intention was to point out that we
> tried to keep the door open to everyone, even those who might be
> called "haters", but I clearly missed the mark.
>
> sage
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>

--
Erik Lindahl <erik.lindahl@xxxxxxxxx>
Professor of Biophysics, Dept. Biochemistry & Biophysics, Stockholm University
Science for Life Laboratory, Box 1031, 17121 Solna, Sweden

Note: I frequently do email outside office hours because it is a convenient time for me to write, but please do not interpret that as an expectation for you to respond outside your work hours.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
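
For the monmap question above, the following is only a sketch of what appears to be the cephadm way of doing it, pieced together from the tools themselves rather than from any official recovery guide. The daemon name (mon.host1), the monitor address, and the exact "cephadm unit" syntax are placeholders/assumptions and may differ between releases.

  # On the mon host: stop the containerized mon first (if this syntax
  # differs in your release, systemctl on the ceph-<fsid>@mon.<id>
  # unit is the fallback).
  cephadm unit --name mon.host1 stop

  # Open a shell in a container that should have this daemon's config,
  # keyring and data directory mounted in the usual (native) locations.
  cephadm shell --name mon.host1

  # Inside that shell the "native" documentation should apply again:
  ceph-mon -i host1 --extract-monmap /tmp/monmap
  monmaptool --print /tmp/monmap

  # If a monitor was dropped from the map, re-add it and inject the
  # modified map back (name and address are placeholders):
  monmaptool --add host1 10.0.0.1:6789 /tmp/monmap
  ceph-mon -i host1 --inject-monmap /tmp/monmap

  # Exit the shell and start the mon again:
  cephadm unit --name mon.host1 start

The idea behind "cephadm shell --name <daemon>" is that it starts a container with that daemon's configuration and data directory mounted where the native tools expect them, so the existing ceph-mon/monmaptool documentation should work unchanged inside it - provided the container image is still available locally while the cluster is down.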