On Tue, May 17, 2022 at 1:14 PM Cory Snyder <csnyder@xxxxxxxxx> wrote:
>
> Hi all,
>
> Unfortunately, we experienced some issues with the upgrade to 16.2.8
> on one of our larger clusters. Within a few hours of the upgrade, all
> 5 of our managers had become unavailable. We found that they were all
> deadlocked due to (what appears to be) a regression with GIL and mutex
> handling. See https://tracker.ceph.com/issues/39264 and
> https://github.com/ceph/ceph/pull/38677 for context on previous
> manifestations of the issue.
>
> I discovered some mistakes within a recent Pacific backport that seem
> to be responsible. Here is the tracker for the regression:
> https://tracker.ceph.com/issues/55687. Here is an open PR that should
> resolve the problem: https://github.com/ceph/ceph/pull/38677.

I guess you mean https://github.com/ceph/ceph/pull/46302 ?

Thanks

.. dan

>
> Note that this is a sort of race condition, and the issue tends to
> manifest itself more frequently in larger clusters. Enabling certain
> modules may also make it more likely to occur. On our cluster, MGRs
> are consistently deadlocking within about an hour.
>
> Hopefully this is useful to others who are considering an upgrade!
>
> Thanks,
>
> Cory Snyder
>
>
> On Mon, May 16, 2022 at 3:46 PM David Galloway <dgallowa@xxxxxxxxxx> wrote:
> >
> > We're happy to announce the 8th backport release in the Pacific series.
> > We recommend that users update to this release. For detailed release
> > notes with links & changelog, please refer to the official blog entry at
> > https://ceph.io/en/news/blog/2022/v16-2-8-pacific-released
> >
> > Notable Changes
> > ---------------
> >
> > * MON/MGR: Pools can now be created with the `--bulk` flag. Any pools
> >   created with `bulk` will use a profile of the `pg_autoscaler` that
> >   provides more performance from the start. However, any pools created
> >   without the `--bulk` flag will keep the old behavior by
> >   default. For more details, see:
> >   https://docs.ceph.com/en/latest/rados/operations/placement-groups/
> >
> > * MGR: The pg_autoscaler can now be turned `on` and `off` globally with
> >   the `noautoscale` flag. By default this flag is unset and the default
> >   pg_autoscale mode remains the same. For more details, see:
> >   https://docs.ceph.com/en/latest/rados/operations/placement-groups/
> >
> > * A health warning will now be reported if the ``require-osd-release``
> >   flag is not set to the appropriate release after a cluster upgrade.
> >
> > * CephFS: Upgrading Ceph Metadata Servers when using multiple active
> >   MDSs requires ensuring that no pending stray entries which are
> >   directories are present for active ranks except rank 0. See
> >   https://docs.ceph.com/en/latest/releases/pacific/#upgrading-from-octopus-or-nautilus.
> >
> > Getting Ceph
> > ------------
> > * Git at git://github.com/ceph/ceph.git
> > * Tarball at https://download.ceph.com/tarballs/ceph-16.2.8.tar.gz
> > * Containers at https://quay.io/repository/ceph/ceph
> > * For packages, see https://docs.ceph.com/docs/master/install/get-packages/
> > * Release git sha1: 209e51b856505df4f2f16e54c0d7a9e070973185
> >
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx