> > We set up a test cluster with a script producing realistic workload and
> > started testing an upgrade under load. This took about a month (meaning
> > repeating the upgrade with a cluster on mimic deployed and populated

Hi Frank, do you have such scripts online? On GitHub or so? I was thinking
of compiling el9 RPMs for Nautilus and running tests for a few days on a
test cluster with mixed el7 and el9 hosts.

> > So to get back to my starting point, we admins actually value rock solid
> > over features. I know that this is boring for devs, but nothing is worse
> > than nobody using your latest and greatest - which probably was the
> > motivation for your question. If the upgrade paths were more solid and
> > questions like "why does an OSD conversion not lead to an OSD that is
> > identical to one deployed freshly" or "where does the performance go"
> > were actually tracked down, we would be much less reluctant to upgrade.
> >
> > I will bring it up here again: with the complexity the code base has
> > reached by now, the 2-year release cadence is way too fast; it doesn't
> > leave enough time for releases to mature, nor for us to upgrade at the
> > same pace. More and more admins will be several cycles behind, and we
> > are reaching the point where major bugs in so-called EOL versions are
> > only discovered before large clusters have even reached that version,
> > which might become a fundamental blocker to upgrades entirely.

Indeed.

> An alternative to increasing the release cadence would be to keep more
> cycles in the life-time loop instead of only the last 2 major releases.
> 4 years really is nothing when it comes to storage.
>
I would like to see this change also.
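In case it helps the discussion: what I have in mind for the "realistic
workload" part is something like the minimal sketch below, a Python loop
using the librados bindings that keeps writing, reading and deleting small
objects in a dedicated test pool while the upgrade is stepped through. This
is only my own rough idea, not Frank's actual script; the pool name, object
size, operation mix and conffile path are placeholders to adjust.

    #!/usr/bin/env python3
    # Minimal load-generator sketch for a *test* pool during an upgrade.
    # Assumes python3-rados is installed and a pool named "upgrade-test"
    # exists; all names and sizes here are placeholders, not a real setup.
    import os
    import random
    import time

    import rados

    POOL = "upgrade-test"        # dedicated test pool, never production
    OBJ_COUNT = 1000             # size of the working set
    OBJ_SIZE = 4 * 1024 * 1024   # 4 MiB objects, roughly RBD-chunk sized

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    ioctx = cluster.open_ioctx(POOL)

    try:
        while True:
            name = "load-%d" % random.randrange(OBJ_COUNT)
            op = random.random()
            if op < 0.5:
                # write / overwrite a whole object
                ioctx.write_full(name, os.urandom(OBJ_SIZE))
            elif op < 0.9:
                # read it back; skip objects we have not written yet
                try:
                    ioctx.read(name, OBJ_SIZE)
                except rados.ObjectNotFound:
                    pass
            else:
                # occasionally delete, so the pool does not only grow
                try:
                    ioctx.remove_object(name)
                except rados.ObjectNotFound:
                    pass
            time.sleep(0.01)     # crude rate limit, tune to taste
    finally:
        ioctx.close()
        cluster.shutdown()

A few of these running in parallel (plus something like "rados bench" or
real client I/O on top) should be enough to compare latencies before and
after each upgrade step.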