Cephadm: Upgrade 15.2.5 -> 15.2.9 stops on non existing OSD

Before I started the upgrade the cluster was healthy, but one OSD (osd.355) was down; I can't remember whether it was in or out.
The upgrade was started with
    ceph orch upgrade start --image goharbor.example.com/library/ceph/ceph:v15.2.9

The upgrade started but when Ceph tried to upgrade osd.355 it paused with the following messages:

2021-03-11T09:15:35.638104+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Target is goharbor.example.com/library/ceph/ceph:v15.2.9 with id dfc48307963697ff48acd9dd6fda4a7a24017b9d8124f86c2a542b0802fe77ba
2021-03-11T09:15:35.639882+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking mgr daemons...
2021-03-11T09:15:35.644170+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: All mgr daemons are up to date.
2021-03-11T09:15:35.644376+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking mon daemons...
2021-03-11T09:15:35.647669+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: All mon daemons are up to date.
2021-03-11T09:15:35.647866+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking crash daemons...
2021-03-11T09:15:35.652035+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Setting container_image for all crash...
2021-03-11T09:15:35.653683+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: All crash daemons are up to date.
2021-03-11T09:15:35.653896+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking osd daemons...
2021-03-11T09:15:36.273345+0000 mgr.pech-mon-2.cjeiyc [INF] It is presumed safe to stop ['osd.355']
2021-03-11T09:15:36.273504+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: It is presumed safe to stop ['osd.355']
2021-03-11T09:15:36.273887+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Redeploying osd.355
2021-03-11T09:15:36.276673+0000 mgr.pech-mon-2.cjeiyc [ERR] Upgrade: Paused due to UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.355 on host pech-hd-009 failed.


One of the first things the upgrade did was to upgrade the mons, so they were restarted, and now osd.355 no longer exists:

    $ ceph osd info osd.355
    Error EINVAL: osd.355 does not exist

But if I run a resume
    ceph orch upgrade resume
it still tries to upgrade osd.355, same message as above.

I also tried to stop and start the upgrade again with
    ceph orch upgrade stop
    ceph orch upgrade start --image goharbor.example.com/library/ceph/ceph:v15.2.9
but it still tries to upgrade osd.355, with the same message as above.

Looking at the source code, it looks like it gets the list of daemons to upgrade from the mgr cache, so I restarted both mgrs, but it still tries to upgrade osd.355.
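
The mismatch described here (the orchestrator still tracking a daemon that the OSD map no longer contains) could be confirmed with a small helper along the following lines. This is only a sketch, not something that was run as part of the troubleshooting above: the function name `stale_osds` and the file paths are hypothetical, and it assumes the first column of `ceph orch ps` is the daemon name and that `ceph osd ls` prints one OSD id per line.

```shell
#!/bin/sh
# Print OSD daemons that appear in the orchestrator's daemon list but whose
# id is missing from the OSD map.
#   $1: file with daemon names, one per line (e.g. osd.355, mon.a)
#   $2: file with OSD ids, one per line (e.g. 355)
stale_osds() {
    orch_list=$1
    osd_ids=$2
    # Keep only names of the form osd.N, then report those whose numeric id
    # does not occur as a whole line in the OSD id list.
    grep -E '^osd\.[0-9]+$' "$orch_list" | while read -r name; do
        id=${name#osd.}
        grep -qx "$id" "$osd_ids" || echo "$name"
    done
}

# Possible usage on a live cluster (assumed, not run here):
#   ceph orch ps | awk 'NR > 1 { print $1 }' > /tmp/orch_daemons
#   ceph osd ls > /tmp/osd_ids
#   stale_osds /tmp/orch_daemons /tmp/osd_ids
```

If osd.355 shows up in the output, the orchestrator's inventory and the OSD map disagree, which would match the behaviour seen during the upgrade.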


Does anyone know how I can get the upgrade to continue?

--
Kai Stian Olstad
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


