Hi Kai,

looks like

$ ssh pech-hd-009
# cephadm ls

is returning this non-existent OSD.

Can you verify that `cephadm ls` on that host doesn't print osd.355?

Best,
Sebastian

On 11.03.21 at 12:16, Kai Stian Olstad wrote:
> Before I started the upgrade the cluster was healthy, but one
> OSD (osd.355) was down; I can't remember if it was in or out.
> The upgrade was started with
>     ceph orch upgrade start --image goharbor.example.com/library/ceph/ceph:v15.2.9
>
> The upgrade started, but when Ceph tried to upgrade osd.355 it paused
> with the following messages:
>
> 2021-03-11T09:15:35.638104+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Target is goharbor.example.com/library/ceph/ceph:v15.2.9 with id dfc48307963697ff48acd9dd6fda4a7a24017b9d8124f86c2a542b0802fe77ba
> 2021-03-11T09:15:35.639882+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking mgr daemons...
> 2021-03-11T09:15:35.644170+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: All mgr daemons are up to date.
> 2021-03-11T09:15:35.644376+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking mon daemons...
> 2021-03-11T09:15:35.647669+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: All mon daemons are up to date.
> 2021-03-11T09:15:35.647866+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking crash daemons...
> 2021-03-11T09:15:35.652035+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Setting container_image for all crash...
> 2021-03-11T09:15:35.653683+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: All crash daemons are up to date.
> 2021-03-11T09:15:35.653896+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking osd daemons...
> 2021-03-11T09:15:36.273345+0000 mgr.pech-mon-2.cjeiyc [INF] It is presumed safe to stop ['osd.355']
> 2021-03-11T09:15:36.273504+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: It is presumed safe to stop ['osd.355']
> 2021-03-11T09:15:36.273887+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Redeploying osd.355
> 2021-03-11T09:15:36.276673+0000 mgr.pech-mon-2.cjeiyc [ERR] Upgrade: Paused due to UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.355 on host pech-hd-009 failed.
>
>
> One of the first things the upgrade did was to upgrade the mons, so they
> were restarted, and now osd.355 no longer exists:
>
> $ ceph osd info osd.355
> Error EINVAL: osd.355 does not exist
>
> But if I run a resume
>     ceph orch upgrade resume
> it still tries to upgrade osd.355, with the same message as above.
>
> I tried to stop and start the upgrade again with
>     ceph orch upgrade stop
>     ceph orch upgrade start --image goharbor.example.com/library/ceph/ceph:v15.2.9
> and it still tries to upgrade osd.355, with the same message as above.
>
> Looking at the source code, it looks like it gets the daemons to upgrade
> from the mgr cache, so I restarted both mgrs, but it still tries to
> upgrade osd.355.
>
>
> Does anyone know how I can get the upgrade to continue?
>
> --
> Kai Stian Olstad
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>

--
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Managing Director: Felix Imendörffer
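
The `cephadm ls` check asked for above could be scripted roughly as follows. This is only a sketch, assuming `cephadm ls` prints a JSON array of daemon entries that each carry a "name" field; the helper name `find_daemon` and the sample entries are hypothetical and are not output from pech-hd-009.

```python
import json


def find_daemon(cephadm_ls_json, daemon_name):
    """Return the entries whose "name" equals daemon_name.

    Assumes the input is the JSON array printed by `cephadm ls`,
    where every entry is an object with a "name" key.
    """
    daemons = json.loads(cephadm_ls_json)
    return [d for d in daemons if d.get("name") == daemon_name]


# Hypothetical sample standing in for real `cephadm ls` output.
sample = """
[
  {"style": "cephadm:v1", "name": "osd.355"},
  {"style": "cephadm:v1", "name": "osd.12"}
]
"""

stale = find_daemon(sample, "osd.355")
if stale:
    print("cephadm ls still lists osd.355 on this host")
else:
    print("osd.355 is not listed by cephadm ls")
```

On the host itself, the sample string would be replaced by the real output, for example by saving `cephadm ls` to a file and reading that file in. If osd.355 still shows up there, that would be consistent with the upgrade picking up a daemon the cluster no longer knows about.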