On 17.11.21 at 18:09, Janne Johansson wrote:
>> * I personally wouldn't want to run an LTS release based on ... what
>> would that be now.. Luminous + security patches?? IMO, the new
>> releases really are much more performant, much more scalable. N, O,
>> and P are really much much *much* better than previous releases. For
>> example, I would not enable snapshots on any cephfs except Pacific --
>> but do we really want to backport the "stray dir splitting"
>> improvement all the way back to mimic or L? -- to me that seems
>> extremely unwise and a waste of the developers' limited time.
>
> Then again, I am hit by the 13.2.7+ bug
> https://tracker.ceph.com/issues/43259 on one Mimic cluster.
>
> From what I can divine, it is a regression where some other fix or
> feature for rgw caused CopyObject to start failing for S3 clusters
> after 13.2.7. No fix or revert was done for Mimic, nor for Nautilus.
> Those two releases then "timed out" and aren't considered worth the
> developers' time anymore, so the ticket just got bumped to "fix later,
> and backport to Oct and Pac when that is done."
>
> So, while I would have liked to simply upgrade to N, O, or P to fix
> this, none of them currently has the fix, so upgrading the cluster
> will not help me, and the N->O bump will incur one of those "an hour
> or more per HDD OSD" conversions, which are kind of sad to plan for,
> and as we know the Pacific release is currently not a fun place to
> upgrade into either.
>
> In my dream world, an LTS would not have seen one feature break
> another, and if it wasn't a feature but a fix for something else, then
> someone would have had to decide whether it is OK to cause one bug in
> order to fix another, and revert the fix if breaking old features was
> not OK. As it is now, a third option was possible: to let it slip,
> not fix it at all, and just tag it for later because so much else has
> changed by now.
> It got "fixed" in master after almost two years, but the backports to
> O and P have been stale for 4 months, which means I will have to make
> several jumps and then move to Q early on when it arrives in order to
> get back to the working setup I had at 13.2.6.
>
> This is at least my reason for wanting something like an LTS version.

Thanks for all your feedback. I hope we can all meet tomorrow in the
Ceph Dev User meeting!

I have to say that we use Ceph only for RBD VM storage at the moment.
For that application Nautilus seems to be a good choice. It lacks RBD
encryption and read leases, but for us upgrading from N to O or P is
currently not possible from an operational point of view, mainly
because of the "waiting for readable" issue when an OSD restarts. This
was the main reason to start with N and not O a year ago. So I can
live with the lack of RBD encryption and just do it in the hypervisor.

N has no issue when a large number of OSDs go up/down; rebalancing on
the order of several tens of GB/s has nearly no client impact if the
wrong default for osd_op_queue_cut_off is corrected. I do not know
whether it works as smoothly under O or P once the laggy PG issue is
fixed, but I'm afraid to test it. I have a small dev environment, but
that does not reflect production use patterns, I/O loads, and size.
I'm seriously afraid to find out that it does not work as smoothly
after having upgraded to O or P with no way back. The main reason is
that we do not have, e.g., an S3 gateway that is just slow for some
time. I have several thousand VMs that might have serious issues if
there are I/O stalls or slow I/O for more than just a few (1-2)
seconds.

We already had to find out that bug fixes were not backported even
when N was still maintained, so we already use the latest N release
with very few additional fixes.

If the people longing for an LTS release are mainly those who are
using Ceph as VM storage, we could use this as a basis.
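For readers wondering how the osd_op_queue_cut_off correction mentioned above is typically applied: a minimal sketch, assuming a Nautilus-or-later cluster where the shipped default is "low" and the commonly recommended value is "high". Verify the behavior on your own cluster; the setting may only take effect after the OSDs restart.

```shell
# Queue replication (sub-op) traffic at high priority so heavy
# rebalancing interferes less with client I/O.
# Sketch, not a definitive recipe -- check current defaults first:
ceph config get osd osd_op_queue_cut_off

# Set the value cluster-wide via the MON config database:
ceph config set osd osd_op_queue_cut_off high
```

The same can be done per-host in ceph.conf under the [osd] section; either way, restart the OSDs in a controlled fashion so the new cut-off is actually in use.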
Thanks,
Peter

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx