Re: Cephadm: Upgrade 15.2.5 -> 15.2.9 stops on non existing OSD

Hi Kai,

It looks like

$ ssh pech-hd-009
# cephadm ls

is returning this non-existent OSD.

Can you verify that `cephadm ls` on that host doesn't
print osd.355?
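
If the output is long, a small filter can make the check easier. This is
only a sketch, and it assumes that `cephadm ls` prints a JSON array of
daemon objects, each with a "name" field. The helper name `has_daemon` is
made up for this example and is not part of cephadm.

```shell
# Hypothetical helper (not part of cephadm): reads `cephadm ls` output on
# stdin and succeeds if a daemon with the given name is listed.
# Assumption: the output is JSON with a "name" field per daemon.
has_daemon() {
    # Strip whitespace so the match does not depend on JSON indentation,
    # then look for the exact, fully quoted name.
    tr -d ' \t\n' | grep -Fq "\"name\":\"$1\""
}

# Usage, as root on the affected host:
#   cephadm ls | has_daemon osd.355 && echo "osd.355 is still listed"
```

Matching the quoted name as a fixed string keeps osd.35 from matching
osd.355.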

Best,
Sebastian

On 11.03.21 at 12:16, Kai Stian Olstad wrote:
> Before I started the upgrade the cluster was healthy, but one
> OSD (osd.355) was down; I can't remember if it was in or out.
> Upgrade was started with
>     ceph orch upgrade start --image goharbor.example.com/library/ceph/ceph:v15.2.9
> 
> The upgrade started but when Ceph tried to upgrade osd.355 it paused
> with the following messages:
> 
>     2021-03-11T09:15:35.638104+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Target is goharbor.example.com/library/ceph/ceph:v15.2.9 with id dfc48307963697ff48acd9dd6fda4a7a24017b9d8124f86c2a542b0802fe77ba
>     2021-03-11T09:15:35.639882+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking mgr daemons...
>     2021-03-11T09:15:35.644170+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: All mgr daemons are up to date.
>     2021-03-11T09:15:35.644376+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking mon daemons...
>     2021-03-11T09:15:35.647669+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: All mon daemons are up to date.
>     2021-03-11T09:15:35.647866+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking crash daemons...
>     2021-03-11T09:15:35.652035+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Setting container_image for all crash...
>     2021-03-11T09:15:35.653683+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: All crash daemons are up to date.
>     2021-03-11T09:15:35.653896+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Checking osd daemons...
>     2021-03-11T09:15:36.273345+0000 mgr.pech-mon-2.cjeiyc [INF] It is presumed safe to stop ['osd.355']
>     2021-03-11T09:15:36.273504+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: It is presumed safe to stop ['osd.355']
>     2021-03-11T09:15:36.273887+0000 mgr.pech-mon-2.cjeiyc [INF] Upgrade: Redeploying osd.355
>     2021-03-11T09:15:36.276673+0000 mgr.pech-mon-2.cjeiyc [ERR] Upgrade: Paused due to UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.355 on host pech-hd-009 failed.
> 
> 
> One of the first things the upgrade did was to upgrade the mons, so they
> were restarted, and now osd.355 no longer exists:
> 
>     $ ceph osd info osd.355
>     Error EINVAL: osd.355 does not exist
> 
> But if I run a resume
>     ceph orch upgrade resume
> it still tries to upgrade osd.355, same message as above.
> 
> I tried to stop and start the upgrade again with
>     ceph orch upgrade stop
>     ceph orch upgrade start --image goharbor.example.com/library/ceph/ceph:v15.2.9
> it still tries to upgrade osd.355, with the same message as above.
> 
> Looking at the source code, it looks like it gets the daemons to upgrade
> from the mgr cache, so I restarted both mgrs, but it still tries to
> upgrade osd.355.
> 
> 
> Does anyone know how I can get the upgrade to continue?
> 
> -- 
> Kai Stian Olstad
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
> 

-- 
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer
