Hello Dave,

> The potential to lose or lose access to millions of files/objects or
> petabytes of data is enough to keep you up at night.

> Many of us out here have become critically dependent on Ceph storage,
> and probably most of us can barely afford our production clusters, much
> less a test cluster.

Please remember, free software still comes with a price. You cannot expect someone to work on your individual problem while being cheap about your highly critical data. If your data has value, then you should invest in ensuring data safety. There are companies out there paying Ceph developers and fixing bugs, so your problem will be gone as soon as you A) contribute code yourself or B) pay someone to contribute code.

Don't get me wrong, every dev here should focus on delivering rock-solid work, and I believe they do, but in the end it's software, and software will never be free of bugs. Ceph does quite a good job of protecting your data, and in my personal experience, if you don't do crazy stuff and execute even crazier commands with "yes-i-really-mean-it", you usually don't lose data.

> The real point here: From what I'm reading in this mailing list it
> appears that most non-developers are currently afraid to risk an
> upgrade to Octopus or Pacific. If this is an accurate perception then
> THIS IS THE ONLY PROBLEM.

Octopus is one of the best releases ever. Our support engineers often upgrade old, unmaintained installations from some super old release to Octopus to get them running again or to have proper tooling to fix the issue. But I agree, we at croit are still afraid of pushing our users to Pacific, as we encounter bugs in our tests. This will change soon, however, as we believe we are close to a stable enough Pacific release.

--
Martin Verges
Managing director
Mobile: +49 174 9335695 | Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


On Wed, 17 Nov 2021 at 18:41, Dave Hall <kdhall@xxxxxxxxxxxxxx> wrote:
> Sorry to be a bit edgy, but...
>
> So at least 5 customers that you know of have a test cluster, or do you
> have 5 test clusters? So 5 test clusters out of how many total Ceph
> clusters worldwide?
>
> Answers like this miss the point. Ceph is an amazing concept. That it
> is Open Source makes it 10x more amazing. But storage is big, like
> glaciers and tectonic plates. The potential to lose or lose access to
> millions of files/objects or petabytes of data is enough to keep you up
> at night.
>
> Many of us out here have become critically dependent on Ceph storage,
> and probably most of us can barely afford our production clusters, much
> less a test cluster.
>
> The best I could do right now for a test cluster would be 3 VirtualBox
> VMs with about 10GB of disk each. Does anybody out there think I could
> find my way past some of the more gnarly O and P issues with this as my
> test cluster?
>
> The real point here: From what I'm reading in this mailing list it
> appears that most non-developers are currently afraid to risk an
> upgrade to Octopus or Pacific. If this is an accurate perception then
> THIS IS THE ONLY PROBLEM.
>
> Don't shame the users who are more concerned about stability than
> fresh paint.
>
> -Dave
>
> --
> Dave Hall
> Binghamton University
> kdhall@xxxxxxxxxxxxxx
>
> On Wed, Nov 17, 2021 at 11:18 AM Stefan Kooman <stefan@xxxxxx> wrote:
> > On 11/17/21 16:19, Marc wrote:
> > >> The CLT is discussing a more feasible alternative to LTS, namely
> > >> to publish an RC for each point release and involve the user
> > >> community to help test it.
> > >
> > > How many users even have the availability of a 'test cluster'?
> >
> > At least 5 (one physical 3-node).
> > We installed a few of them with the exact same version as when we
> > started prod (Luminous 12.2.4 IIRC) and have upgraded them ever
> > since. Especially for cases where old pieces of metadata might cause
> > issues in the long run (pre-Jewel metadata blows up in Pacific in the
> > MDS case). Same for the OSD OMAP conversion troubles in Pacific.
> > Especially in these cases, testing before real prod might have
> > revealed that. A VM environment would be ideal for this, as you could
> > just snapshot state and play it back when needed. Ideally with MDS /
> > RGW / RBD workloads on them to make sure all use cases are tested.
> >
> > But these clusters don't have the same load as prod, nor the same
> > data, so things might still break in special ways. But at least we
> > try to avoid that as much as possible.
> >
> > Gr. Stefan
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
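[Editor's note] Stefan's suggestion of snapshotting VM state and playing it back could look roughly like the sketch below. This is only an illustration, not anything from the thread: the VM names (ceph-test-1..3) and snapshot label are hypothetical, and the VBoxManage calls are guarded by a dry-run flag so nothing is executed unless you opt in.

```shell
#!/bin/sh
# Hypothetical sketch: snapshot each VirtualBox test-cluster node before
# attempting a Ceph upgrade, so the pre-upgrade state can be restored if
# the upgrade breaks. VM names and the snapshot label are assumptions.
DRY_RUN=${DRY_RUN:-1}   # set DRY_RUN=0 to actually invoke VBoxManage
PLAN=""

run() {
    # Record the command and either print it (dry run) or execute it.
    PLAN="$PLAN$*;"
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

for vm in ceph-test-1 ceph-test-2 ceph-test-3; do
    run VBoxManage snapshot "$vm" take "pre-pacific-upgrade"
done

# After a failed upgrade, roll every node back to the saved state:
# for vm in ceph-test-1 ceph-test-2 ceph-test-3; do
#     VBoxManage controlvm "$vm" poweroff
#     VBoxManage snapshot "$vm" restore "pre-pacific-upgrade"
# done
```

Taking the snapshots before the `ceph orch upgrade` (or package upgrade) step gives you the "play back when needed" safety net Stefan describes, which a bare-metal test cluster cannot offer as cheaply.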