Hi-

I tried to start the upgrade using

ceph orch upgrade start --ceph-version 18.2.1
Initiating upgrade to quay.io/ceph/ceph:v18:v18.2.1

And checked on the status

[root@rhel1 ~]# ceph orch upgrade status
{
    "target_image": "quay.io/ceph/ceph:v18:v18.2.1",
    "in_progress": true,
    "which": "Upgrading all daemon types on all hosts",
    "services_complete": [],
    "progress": "",
    "message": "Error: UPGRADE_FAILED_PULL: Upgrade: failed to pull target image",
    "is_paused": true
}

However this does work:

[root@rhel1 ~]# ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.1
Initiating upgrade to quay.io/ceph/ceph:v18.2.1

It looks like something is inserting an incorrect v18 into the image tag when the upgrade is started by version. Once I used the image, the upgrade finished quickly.

-Rob

-----Original Message-----
From: Yuri Weinstein <yweinste@xxxxxxxxxx>
Sent: Monday, December 18, 2023 4:20 PM
To: ceph-announce@xxxxxxx; ceph-users <ceph-users@xxxxxxx>; dev <dev@xxxxxxx>; ceph-maintainers@xxxxxxx
Subject: v18.2.1 Reef released

We're happy to announce the 1st backport release in the Reef series.

This is the first backport release in the Reef series, and the first with Debian packages, for Debian Bookworm. We recommend all users update to this release.

https://ceph.io/en/news/blog/2023/v18-2-1-reef-released/

Notable Changes
---------------
* RGW: S3 multipart uploads using Server-Side Encryption now replicate correctly in multi-site. Previously, the replicas of such objects were corrupted on decryption. A new tool, ``radosgw-admin bucket resync encrypted multipart``, can be used to identify these original multipart uploads. The ``LastModified`` timestamp of any identified object is incremented by 1ns to cause peer zones to replicate it again. For multi-site deployments that make any use of Server-Side Encryption, we recommend running this command against every bucket in every zone after all zones have upgraded.
* CEPHFS: MDS evicts clients that are not advancing their request tids. Such clients cause a large buildup of session metadata, which results in the MDS going read-only because the RADOS operation exceeds the size threshold. The `mds_session_metadata_threshold` config controls the maximum size to which the (encoded) session metadata can grow.

* RGW: New tools have been added to radosgw-admin for identifying and correcting issues with versioned bucket indexes. Historical bugs with the versioned bucket index transaction workflow made it possible for the index to accumulate extraneous "book-keeping" olh entries and plain placeholder entries. In some specific scenarios where clients made concurrent requests referencing the same object key, it was likely that a lot of extra index entries would accumulate. When a significant number of these entries are present in a single bucket index shard, they can cause high bucket listing latencies and lifecycle processing failures. To check whether a versioned bucket has unnecessary olh entries, users can now run ``radosgw-admin bucket check olh``. If the ``--fix`` flag is used, the extra entries will be safely removed.

  In a separate issue from the one described above, it is also possible that some versioned buckets are maintaining extra unlinked objects that are not listable from the S3/Swift APIs. These extra objects are typically a result of PUT requests that exited abnormally, in the middle of a bucket index transaction - so the client would not have received a successful response. Bugs in prior releases made these unlinked objects easy to reproduce with any PUT request that was made on a bucket that was actively resharding.
  Besides the extra space that these hidden, unlinked objects consume, there can be another side effect in certain scenarios, caused by the nature of the failure mode that produced them, where a client of a bucket that was a victim of this bug may find the object associated with the key to be in an inconsistent state. To check whether a versioned bucket has unlinked entries, users can now run ``radosgw-admin bucket check unlinked``. If the ``--fix`` flag is used, the unlinked objects will be safely removed.

  Finally, a third issue made it possible for versioned bucket index stats to be accounted inaccurately. The tooling for recalculating versioned bucket stats also had a bug, and was not previously capable of fixing these inaccuracies. This release resolves those issues and users can now expect that the existing ``radosgw-admin bucket check`` command will produce correct results.

  We recommend that users with versioned buckets, especially those that existed on prior releases, use these new tools to check whether their buckets are affected and to clean them up accordingly.

* mgr/snap-schedule: For clusters with multiple CephFS file systems, all the snap-schedule commands now expect the '--fs' argument.

* RADOS: A POOL_APP_NOT_ENABLED health warning will now be reported if the application is not enabled for the pool, regardless of whether the pool is in use. Always add the ``application`` label to a pool to avoid the POOL_APP_NOT_ENABLED health warning being reported for that pool. Users can temporarily mute this warning using ``ceph health mute POOL_APP_NOT_ENABLED``.
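
A minimal sketch of how the versioned-bucket checks described above might be scripted, for illustration only. The `run` and `check_bucket` helpers, the `DRY_RUN` variable and the bucket name `mybucket` are hypothetical and not part of the release; `--bucket` is assumed to be how the bucket is selected, as with other radosgw-admin bucket commands. By default the script only prints the commands, so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch only: run the versioned-bucket checks from the release notes
# against one bucket. Helper names and DRY_RUN are hypothetical.

# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# check_bucket BUCKET [--fix]
# With --fix, the olh and unlinked checks remove the extra entries,
# per the release notes.
check_bucket() {
    bucket="$1"
    fix="${2:-}"
    # ${fix:+"$fix"} expands to nothing when no flag was given.
    run radosgw-admin bucket check olh --bucket="$bucket" ${fix:+"$fix"}
    run radosgw-admin bucket check unlinked --bucket="$bucket" ${fix:+"$fix"}
    # The existing check command, which the notes say now gives correct results.
    run radosgw-admin bucket check --bucket="$bucket"
}
```

Calling `check_bucket mybucket` prints the three commands for review; passing `--fix` as a second argument appends it to the olh and unlinked checks. Setting `DRY_RUN=0` before the call would execute the commands instead of printing them.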

Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at https://download.ceph.com/tarballs/ceph-18.2.1.tar.gz
* Containers at https://quay.io/repository/ceph/ceph
* For packages, see https://docs.ceph.com/en/latest/install/get-packages/
* Release git sha1: 7fe91d5d5842e04be3b4f514d6dd990c54b29c76
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx