Re: Why you might want packages not containers for Ceph deployments

Francois Legrand <fleg@xxxxxxxxxxxxxx> · Mon, 8 Nov 2021 17:59:38 +0100

Hi Frank,

I totally agree with your point 3 (and with points 1 and 2 as well). 
Generally speaking, the release cycle of a lot of software tends to 
become faster and faster (not only ceph, but also openstack etc.), and 
it is really hard and tricky to keep an infrastructure up to date in 
such conditions, even more so when you deal with storage. As a result, 
as you explained perfectly, this gives the impression that the product 
is not that robust, contains a lot of bugs and needs a lot of patches. 
A few times, upgrades have been released with obvious bugs or 
regressions (e.g. the DNS problem in 14.2.12), and this gives the 
impression that there is an urge to release even if the fixes are not 
fully tested, which leads to a loss of confidence among users.

I am personally in this situation! We wanted to upgrade our Nautilus 
cluster. At first we decided to go directly to Pacific, but looking at 
the list, it appeared to us that Pacific is absolutely not stable enough 
to be considered a production release. We thus decided to go to Octopus; 
maybe we will go to Pacific once v17 is out.

I thus feel that the "latest stable release" (currently Pacific) is in 
fact a development release (with the community as the "testing pool" for 
that release) and that the truly stable release is the n-1 one (Octopus). 
I therefore fully support your request for an LTS release with stability 
as its main goal.

F.

On 08/11/2021 at 13:21, Frank Schilder wrote:
Hi all,

I followed this thread with great interest and would like to add my opinion/experience/wishes as well.

I believe the question of packages versus containers needs a bit more context to be really meaningful. This was already mentioned several times with regard to documentation. I see the following three topics as tightly connected (my opinion/answers included):

1. Distribution: Packages are compulsory, containers are optional.
2. Deployment: ceph-adm (yet another deployment framework) and ceph (the actual storage system) should be strictly separate projects.
3. Release cycles: The release cadence is way too fast; I very much miss a ceph LTS branch with at least 10 years of back-port support.

These are my short answers/wishes/expectations in this context. I will add below some more reasoning as optional reading (warning: wall of text ahead).

1. Distribution
---------

I don't think the question is about packages versus containers, because even if a distribution should decide not to package ceph any more, other distributors certainly will, and the user community will just move away from distributions without ceph packages. In addition, unless Red Hat plans to move to a source-only container where I run the good old configure - make - make install, it will be package-based anyway, so packages are here to stay.

Therefore, as I understand it, this question is about ceph-adm versus other deployment methods. Here, I think the push towards a container-based, ceph-adm-only deployment is unlikely to become the no. 1 choice for everyone, for good reasons already mentioned in earlier messages. In addition, I also believe that development of a general deployment tool is currently not sustainable, as was mentioned by another user. My reasons for this are given in the next section.

2. Deployment
---------

In my opinion, it is really important to distinguish three components of any open-source project: development (release cycles), distribution and deployment. Following the good old philosophy that every tool does exactly one job and does it well, each of these components is a separate project, because they correspond to different tools.

This immediately implies that the ceph documentation should not contain documentation about packaging and deployment tools. Each of these ought to be strictly separate. If I have a low-level problem with ceph and go to the ceph documentation, I do not want to see ceph-adm commands. Ceph documentation should be about ceph (the storage system) only. Such a mix-up leads to problems, and there have already been cases on ceph-users where people could not use the documentation for troubleshooting, because it showed ceph-adm commands but their cluster was not deployed with ceph-adm.

In this context, I would prefer if there was a separate ceph-adm-users list so that ceph-users can focus on actual ceph problems again.

Now to the point that ceph-adm might be an unsustainable project. Although at first glance the idea of a generic deployment tool that solves all problems with a single command might look appealing, it is likely doomed to fail for a simple reason that was already indicated in an earlier message: ceph deployment is subject to a complexity paradox. Ceph has a very large configuration space, and implementing and using a generic tool that covers and understands this configuration space is more complex than deploying any specific ceph cluster, each of which uses only a tiny subset of the entire configuration space.

In other words: deploying a specific ceph cluster is actually not that difficult.

Designing a ceph cluster and dimensioning all of its components is difficult, and none of the current deployment tools help here. There is not even a check for suitable hardware. In addition, technology is moving fast, and adapting a generic tool to new developments in time seems a hopeless task. For example, when will ceph-adm natively support collocated lvm OSDs with dm_cache devices? Is it even worth trying to incorporate this?

My wish would be to keep the ceph project clean of any deployment tasks. In my opinion, the basic ceph tooling is already doing tasks that are the responsibility of a configuration management system, not of a storage system (e.g. deploying unit files by default instead of as an option that is disabled by default).

3. Release cycles
---------

Ceph is a complex system and the code is getting more complex every day. It is very difficult to beat the curse of complexity: development and maintenance effort grows non-linearly (exponentially?) with the number of lines of code. As a consequence, (A) if one wants to maintain quality while adding substantial new features, the release intervals become longer and longer. (B) If one wants to maintain constant release intervals while adding substantial new features, the quality will have to go down. The last option is that (C) new releases with constant release intervals contain ever smaller increments in functionality to maintain quality. I ignore the option of throwing more and more qualified developers at the project, as this seems unlikely and also comes with its own complexity cost.

I'm afraid we are in scenario B. Ceph is losing its aura of being a rock-solid system.

Just recently, there were some ceph-users emails about how dangerous it is (or is not) to upgrade to the latest stable octopus version. The upgrade itself apparently goes well, but what happens then? I personally have seen too many reports that the latest ceph versions are quite touchy and collapse in situations that were never a problem up to mimic (most prominently, that a simple rebalance operation after adding disks gets OSDs to flap and can take a whole cluster down; there have been plenty of cases since nautilus). Stability at scale seems to become a real issue with increasing version numbers. I am myself very hesitant to upgrade, in particular because there is no way back and the cycles of potential doom are so short.

Therefore, I would very much appreciate the creation of a ceph LTS branch with at least 10 years of back-port support, if not longer. In addition, upgrade procedures between LTS versions should allow a downgrade by one version as well (move legacy data along until explicitly allowed to cut all bridges). For any large storage system, robustness, predictability and low maintenance effort are invaluable. For example, our cluster is very demanding compared with our other storage systems: the OSDs have a nasty memory leak, operations get stuck in MONs and MDSes at least once or twice a week due to race conditions, and so on. It is currently not possible to let the cluster run unattended for months or even years, something that is possible, if not the rule, with other (also open-source) storage systems.

Fixing bugs that show up rarely and are very difficult to catch is really important for a storage system with theoretically infinite uptime. Rolling versions over all the time and then throwing "xyz is not supported, try with a newer version" at users when they discover a rare problem after running for a few years does not help to get ceph to a level of stability that will be convincing enough in the long run.

I understand that implementing new features is more fun than bug fixing. However, bug fixing is what makes users trust a platform. I see too many people around me losing faith in ceph at the moment and starting to treat it as a second- or third-class storage system. This is largely due to the short support interval given the actual complexity of the software. Establishing an LTS branch could win back sceptical admins who have started looking for alternatives.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
