On 17.11.21 at 18:09, Janne Johansson wrote:
>> * I personally wouldn't want to run an LTS release based on ... what
>> would that be now.. Luminous + security patches?? IMO, the new
>> releases really are much more performant, much more scalable. N, O,
>> and P are really much much *much* better than previous releases. For
>> example, I would not enable snapshots on any cephfs except Pacific --
>> but do we really want to backport the "stray dir splitting"
>> improvement all the way back to mimic or L? -- to me that seems
>> extremely unwise and a waste of the developers' limited time.
>
> Then again, I am hit by the 13.2.7+ bug
> https://tracker.ceph.com/issues/43259 on one Mimic cluster.
>
> From what I can divine, it is a regression where some other fix or
> feature for rgw caused CopyObject to start failing for S3 clusters
> after 13.2.7. No fix or revert was done for Mimic, nor for Nautilus.
> Those two releases then "timed out" and aren't considered worth the
> developers' time anymore, so the ticket just got bumped to "fix later,
> and backport to Oct and Pac when that is done."
>
> So, while I would have liked to simply upgrade to N, O, or P to fix
> this, none of them currently has the fix, so upgrading the cluster
> will not help me, and the N->O bump will incur one of those "an hour
> or more per HDD OSD" conversions, which are kind of sad to plan for,
> and as we know the Pacific release is currently not a fun place to
> upgrade into either.
>
> In my dream world, an LTS would not have seen one feature break
> another, and if it wasn't a feature but a fix for something else, then
> someone would have had to decide whether it is OK to cause one bug in
> order to fix another, and revert the fix if breaking old features was
> not OK. As it is now, a third option was possible: to let it slip,
> not fix it at all, and just tag it for later because so much else has
> changed by now.
> It got "fixed" in master after almost two years, but the backports to
> O and P have been stale for 4 months, which means I will have to make
> several jumps and then move to Q early on when it arrives in order to
> get back to the working setup I had at 13.2.6.
>
> This is at least my reason for wanting something like an LTS version.

Thanks for all your feedback. I hope we can all meet tomorrow in the
Ceph Dev User meeting!

I have to say that we use Ceph only for RBD VM storage at the moment.
For that application Nautilus seems to be a good choice. It lacks RBD
encryption and read leases, but for us upgrading from N to O or P is
currently not possible from an operational point of view, mainly
because of the "waiting for readable" issue when an OSD restarts. This
was the main reason to start with N and not O a year ago. So I can
live with the lack of RBD encryption and just do it in the hypervisor.

N has no issue when a large number of OSDs go up/down; rebalancing on
the order of several tens of GB/s has nearly no client impact if the
wrong default for osd_op_queue_cut_off is corrected. I do not know
whether it works as smoothly under O or P once the laggy PG issue is
fixed, but I'm afraid to test it. I have a small dev environment, but
that does not reflect production use patterns, I/O loads, and size.
I'm seriously afraid to find out that it does not work as smoothly
after having upgraded to O or P with no way back. The main reason is
that we do not have, e.g., an S3 gateway that is just slow for some
time. I have several thousand VMs that might have serious issues if
there are I/O stalls or slow I/O for more than just a few (1-2)
seconds.

We already had to find out that bug fixes were not backported even
when N was still maintained, so we already use the latest N release
with very few additional fixes.

If the people longing for an LTS release are mainly those who are
using Ceph as VM storage, we could use this as a basis.
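For readers wondering how the osd_op_queue_cut_off correction mentioned above is typically applied: a minimal sketch, assuming a Nautilus-or-later cluster where the shipped default is "low" and the commonly recommended value is "high". Verify the behavior on your own cluster; the setting may only take effect after the OSDs restart.

```shell
# Queue replication (sub-op) traffic at high priority so heavy
# rebalancing interferes less with client I/O.
# Sketch, not a definitive recipe -- check current defaults first:
ceph config get osd osd_op_queue_cut_off

# Set the value cluster-wide via the MON config database:
ceph config set osd osd_op_queue_cut_off high
```

The same can be done per-host in ceph.conf under the [osd] section; either way, restart the OSDs in a controlled fashion so the new cut-off is actually in use.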
Thanks,
Peter

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx