I think I may have found the issue:

https://tracker.ceph.com/issues/50526

It seems it may be fixed in:

https://github.com/ceph/ceph/pull/41045

I hope this can be prioritized as an urgent fix, as it has broken
upgrades on clusters of a relatively normal size (14 nodes, 24x OSDs,
2x NVMe for DB/WAL with 12 OSDs per NVMe), even when new OSDs are not
being deployed, since it still tries to apply the OSD specification.

On Mon, May 10, 2021 at 4:03 PM David Orman <ormandj@xxxxxxxxxxxx> wrote:
>
> Hi,
>
> We are seeing the mgr attempt to apply our OSD spec on the various
> hosts, then block. When we investigate, we see the mgr has executed
> cephadm calls like the following, which are blocking:
>
> root 1522444 0.0 0.0 102740 23216 ? S 17:32 0:00
>  \_ /usr/bin/python3
> /var/lib/ceph/XXXXX/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90
> --image docker.io/ceph/ceph@sha256:694ba9cdcbe6cb7d25ab14b34113c42c2d1af18d4c79c7ba4d1f62cf43d145fe
> ceph-volume --fsid XXXXX -- lvm list --format json
>
> This occurs on all hosts in the cluster after starting, restarting, or
> failing over a manager. It is currently blocking an in-progress
> upgrade on one cluster, following the manager updates.
>
> Looking at the cephadm logs on the host(s) in question, we see the
> last entry appears to be truncated, like:
>
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.db_uuid": "1n2f5v-EEgO-1Kn6-hQd2-v5QF-AN9o-XPkL6b",
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.encrypted": "0",
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.osd_fsid": "XXXX",
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.osd_id": "205",
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.osdspec_affinity": "osd_spec",
> 2021-05-10 17:32:06,471 INFO /usr/bin/podman: "ceph.type": "block",
>
> The previous entry looks like this:
>
> 2021-05-10 17:32:06,469 INFO /usr/bin/podman: "ceph.db_uuid": "TMTPD5-MLqp-06O2-raqp-S8o5-TfRG-hbFmpu",
> 2021-05-10 17:32:06,469 INFO /usr/bin/podman: "ceph.encrypted": "0",
> 2021-05-10 17:32:06,469 INFO /usr/bin/podman: "ceph.osd_fsid": "XXXX",
> 2021-05-10 17:32:06,469 INFO /usr/bin/podman: "ceph.osd_id": "195",
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "ceph.osdspec_affinity": "osd_spec",
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "ceph.type": "block",
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "ceph.vdo": "0"
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: },
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "type": "block",
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: "vg_name": "ceph-ffd1a4a7-316c-4c85-acde-06459e26f2c4"
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: }
> 2021-05-10 17:32:06,470 INFO /usr/bin/podman: ],
>
> We'd like to get to the bottom of this, please let us know what other
> information we can provide.
>
> Thank you,
> David

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
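
For readers unfamiliar with the failure mode being hinted at: a blocked child
process combined with log output that stops mid-record is the classic signature
of a full-pipe deadlock, where the parent drains one of the child's output
pipes to completion while the other pipe's buffer fills up and the child blocks
on write. Whether that is the actual root cause here is a question for the
tracker issue and PR linked above; the Python sketch below is only a generic
illustration of the pattern and of the usual fix (draining both streams
concurrently), not cephadm or ceph-volume code, and the example command at the
bottom is hypothetical.

    import subprocess

    def run_deadlock_prone(cmd):
        # Anti-pattern: read stdout to EOF before touching stderr. If the
        # child writes enough to stderr to fill that pipe's buffer, it blocks
        # on the write, never finishes stdout, and both processes hang.
        p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out = p.stdout.read()   # can block forever once stderr's buffer fills
        err = p.stderr.read()
        p.wait()
        return out, err

    def run_safe(cmd):
        # Fix: drain both pipes concurrently so neither buffer can fill and
        # stall the child; communicate() handles this internally.
        p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = p.communicate()
        return out, err

    # Hypothetical usage with a command that emits verbose JSON output:
    # out, err = run_safe(["ceph-volume", "lvm", "list", "--format", "json"])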