Hi Adam,
Thanks a lot, the solution worked; the cluster is back to HEALTH_OK status.
Best regards,
Stephane
On 11/07/2022 at 15:40, Adam King wrote:
This sounds similar to something I saw once with an upgrade from
17.2.0 to 17.2.1 (which I failed to reproduce). In that case, what
fixed it was stopping the upgrade and manually redeploying both mgr
daemons with the new version: run "ceph orch daemon redeploy
<standby-mgr-daemon-name> --image <image-for-version-upgrading-to>",
wait a few minutes for the redeploy to happen, run "ceph mgr fail",
wait a minute, then run the same redeploy command for the other mgr.
After doing that and starting the upgrade again, it seemed to go okay.
Also, I'd recommend using "--image" for the upgrade command rather
than "--ceph-version". Somebody else also had an upgrade issue and
happened to be using the "--ceph-version" flag
(https://tracker.ceph.com/issues/56485), which has me wondering if
there's a bug with it.
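
For reference, the whole sequence sketched out as commands, assuming
two mgr daemons and the default quay.io image for 16.2.9 (the daemon
names and the image tag are placeholders to adjust for your cluster):

  # stop the stuck upgrade first
  ceph orch upgrade stop
  # redeploy the standby mgr on the target image
  ceph orch daemon redeploy <standby-mgr-daemon-name> --image quay.io/ceph/ceph:v16.2.9
  # give the redeploy a few minutes, then fail over to the new mgr
  ceph mgr fail
  # wait a minute, then redeploy the other (now standby) mgr the same way
  ceph orch daemon redeploy <other-mgr-daemon-name> --image quay.io/ceph/ceph:v16.2.9
  # restart the upgrade, using --image rather than --ceph-version
  ceph orch upgrade start --image quay.io/ceph/ceph:v16.2.9
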
On Fri, Jul 8, 2022 at 10:37 AM Stéphane Caminade
<stephane.caminade@xxxxxxxxxxxxx> wrote:
Hi,
I'm still a little stuck with this situation. Any clues?
Regards,
Stephane
On 29/06/2022 at 10:34, Stéphane Caminade wrote:
> Dear list,
>
> After an upgrade from a package-based cluster running 16.2.9 to
> cephadm with docker (following
> https://docs.ceph.com/en/pacific/cephadm/adoption/), I have a strange
> discrepancy between the running versions:
>
> ceph versions
> {
>     "mon": {
>         "ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)": 3
>     },
>     "mgr": {
>         "ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)": 2
>     },
>     "osd": {
>         "ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)": 181
>     },
>     "mds": {
>         "ceph version 16.2.5-387-g7282d81d (7282d81d2c500b5b0e929c07971b72444c6ac424) pacific (stable)": 3
>     },
>     "overall": {
>         "ceph version 16.2.5-387-g7282d81d (7282d81d2c500b5b0e929c07971b72444c6ac424) pacific (stable)": 3,
>         "ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)": 186
>     }
> }
>
> I tried asking cephadm to upgrade to 16.2.9 (ceph orch upgrade start
> --ceph-version 16.2.9), but it only seems to cycle between the
> active managers (about every 15 to 20s), without doing anything more.
> Here is a part of the logs from one of the MGRs:
>
> 7f599cbe9700  0 [cephadm INFO cephadm.upgrade] Upgrade: Need to upgrade myself (mgr.inf-ceph-mds)
> 7f599cbe9700  0 log_channel(cephadm) log [INF] : Upgrade: Need to upgrade myself (mgr.inf-ceph-mds)
> 7f599cbe9700  0 [cephadm INFO cephadm.services.cephadmservice] Failing over to other MGR
> 7f599cbe9700  0 log_channel(cephadm) log [INF] : Failing over to other MGR
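>
> Per the cephadm troubleshooting docs, the rest of that activity can
> be followed from the cluster log, roughly like this (the debug
> setting is optional and can be reverted afterwards):
>
>   # raise cephadm's cluster-log verbosity to debug
>   ceph config set mgr mgr/cephadm/log_to_cluster_level debug
>   # stream the cephadm channel of the cluster log in real time
>   ceph -W cephadm --watch-debug
>   # or just dump the most recent cephadm events
>   ceph log last cephadm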
>
> Something else strange: it seems to be looking for more daemons to
> upgrade (191) than the 189 reported above (186 on 16.2.9 plus 3 on
> 16.2.5-387-nnnn):
> ceph orch upgrade status
> {
>     "target_image": "quay.io/ceph/ceph@sha256:5d3c9f239598e20a4ed9e08b8232ef653f5c3f32710007b4cabe4bd416bebe54",
>     "in_progress": true,
>     "services_complete": [
>         "mgr",
>         "mon"
>     ],
>     "progress": "187/191 daemons upgraded",
>     "message": ""
> }
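>
> One way to cross-check where the 191 comes from is to list every
> daemon cephadm itself manages and compare that with the counts from
> ceph versions; a rough sketch:
>
>   # list every daemon cephadm manages, with the version each reports
>   ceph orch ps
>   # count them (subtract 1 for the header line)
>   ceph orch ps | wc -l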
>
> So I have two questions:
>
> 1. Do you have any pointers as to where I could look for information
> on what is going on (or not, actually)?
>
> 2. Would it be safe to stop the upgrade, and ask it to safely move
> to 17.2.1 instead?
>
> Best regards,
>
> Stephane
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx