> > We set up a test cluster with a script producing realistic workload and
> > started testing an upgrade under load. This took about a month (meaning
> > repeating the upgrade with a cluster on mimic deployed and populated

Hi Frank, do you have such scripts online? On GitHub or so? I was thinking
of compiling el9 RPMs for Nautilus and running tests for a few days on a
test cluster with mixed el7 and el9 hosts.

> > So to get back to my starting point, we admins actually value rock solid
> > over features. I know that this is boring for devs, but nothing is worse
> > than nobody using your latest and greatest - which probably was the
> > motivation for your question. If the upgrade paths were more solid and
> > questions like "why does an OSD conversion not lead to an OSD that is
> > identical to one deployed freshly" or "where does the performance go"
> > were actually tracked down, we would be much less reluctant to upgrade.
> >
> > I will bring it up here again: with the complexity the code base has
> > reached by now, the 2-year release cadence is way too fast; it doesn't
> > leave enough time for releases to mature, nor for us to upgrade at the
> > same pace. More and more admins will be several cycles behind, and we
> > are reaching the point where major bugs in so-called EOL versions are
> > only discovered before large clusters have even reached that version,
> > which might become a fundamental blocker to upgrades entirely.

Indeed.

> An alternative to increasing the release cadence would be to keep more
> cycles in the life-time loop instead of only the last 2 major releases.
> 4 years really is nothing when it comes to storage.
>
I would like to see this change also.
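In case it helps the discussion: what I have in mind for the "realistic
workload" part is something like the minimal sketch below, a Python loop
using the librados bindings that keeps writing, reading and deleting small
objects in a dedicated test pool while the upgrade is stepped through. This
is only my own rough idea, not Frank's actual script; the pool name, object
size, operation mix and conffile path are placeholders to adjust.

    #!/usr/bin/env python3
    # Minimal load-generator sketch for a *test* pool during an upgrade.
    # Assumes python3-rados is installed and a pool named "upgrade-test"
    # exists; all names and sizes here are placeholders, not a real setup.
    import os
    import random
    import time

    import rados

    POOL = "upgrade-test"        # dedicated test pool, never production
    OBJ_COUNT = 1000             # size of the working set
    OBJ_SIZE = 4 * 1024 * 1024   # 4 MiB objects, roughly RBD-chunk sized

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    ioctx = cluster.open_ioctx(POOL)

    try:
        while True:
            name = "load-%d" % random.randrange(OBJ_COUNT)
            op = random.random()
            if op < 0.5:
                # write / overwrite a whole object
                ioctx.write_full(name, os.urandom(OBJ_SIZE))
            elif op < 0.9:
                # read it back; skip objects we have not written yet
                try:
                    ioctx.read(name, OBJ_SIZE)
                except rados.ObjectNotFound:
                    pass
            else:
                # occasionally delete, so the pool does not only grow
                try:
                    ioctx.remove_object(name)
                except rados.ObjectNotFound:
                    pass
            time.sleep(0.01)     # crude rate limit, tune to taste
    finally:
        ioctx.close()
        cluster.shutdown()

A few of these running in parallel (plus something like "rados bench" or
real client I/O on top) should be enough to compare latencies before and
after each upgrade step.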