Re: Why you might want packages not containers for Ceph deployments

Stefan Kooman <stefan@xxxxxx> · Wed, 17 Nov 2021 19:08:24 +0100

On 11/17/21 18:40, Dave Hall wrote:
Sorry to be a bit edgy, but...

So at least 5 customers that you know of have a test cluster, or do you 
have 5 test clusters?  

We have 5 test clusters.

So 5 test clusters out of how many total Ceph
clusters worldwide?

There is a dashboard for that ... but not all clusters out there can or 
want to send metrics. So it is a few out of many thousands of clusters, 
for sure. A handful of clusters would not be enough to catch corner 
cases, but I doubt that what we have seen are corner cases.

Answers like this miss the point.  Ceph is an amazing concept.  That it 
is Open Source makes it more amazing by 10x.  But storage is big, like 
glaciers and tectonic plates.  The potential to lose or lose access to 
millions of files/objects or petabytes of data is enough to keep you up 
at night.

Many of us out here have become critically dependent on Ceph storage, 
and probably most of us can barely afford our production clusters, much 
less a test cluster.

The best I could do right now today for a test cluster would be 3 
Virtualbox VMs with about 10GB of disk each.  Does anybody out there 
think I could find my way past some of the more gnarly O and P issues 
with this as my test cluster?

You will have a hard time with those small disks, and you will run into 
BlueFS ENOSPC issues real fast.

And yes, I do think some of the issues that have arisen could have been 
caught with long-running test clusters (ones that are upgraded from one 
release to the next).

The real point here:  From what I'm reading in this mailing list it 
appears that most non-developers are currently afraid to risk an upgrade 
to Octopus or Pacific.  If this is an accurate perception then THIS IS 
THE ONLY PROBLEM.

Don't shame the users who are more concerned about stability than fresh 
paint.

I don't want to shame any one user. We are users ourselves, and we are 
concerned about our data (and that of our customers). Keeping Ceph 
HEALTH_OK is a key priority.

I agree that the issue does not seem to be so much about upgrading per 
se, but about not trusting the quality of the newer versions.
Sage made "quality" a key focus point during Cephalocon in Barcelona. So 
it is definitely on the radar. It might just not have been enough, or in 
the right places, to avoid the issues we have seen.

The RC release sounds like a good improvement. And maybe we should think 
about building clusters especially for this purpose.

Gr. Stefan