ceph orch upgrade is stuck at the beginning

Hello, I've tried to upgrade our Ceph cluster to the Pacific release (starting with version 16.2.0, then planning to step through each later version one at a time), but the upgrade is failing on our cluster.

I installed the cluster a long time ago via cephadm on v15 (I believe it was v15.2.8 underneath at the time).

I remember hitting an issue with the ceph mgr that led me to use ceph-base:latest-octopus as a fix (the next version hadn't been released yet, and the bug was crashing the cluster by filling the logs).

The cluster state is OK:

  cluster:
    id:     adc48d6a-61bf-11eb-9212-2f70acf7224f
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum server36,server38,server37 (age 3h)
    mgr: server36.xujjng(active, since 2h), standbys: server37.fyglah
    osd: 52 osds: 52 up (since 4h), 52 in (since 4h)
    rgw: 2 daemons active (2 hosts, 1 zones)

  data:
    pools:   11 pools, 489 pgs
    objects: 1.28M objects, 4.8 TiB
    usage:   15 TiB used, 65 TiB / 81 TiB avail
    pgs:     489 active+clean

  io:
    client:   3.2 MiB/s rd, 24 MiB/s wr, 2.73k op/s rd, 1.81k op/s wr

So I tried "ceph orch upgrade start --ceph-version 16.2.0". The first time, it deployed a new ceph mgr running 16.2.0 and then got stuck there.
After waiting several hours, I stopped the upgrade and restarted it, and nothing happened.

I then manually upgraded the whole cluster, except the 2 rgw daemons and grafana/prometheus/alertmanager/node-exporter.
I retried the orchestrated upgrade and see nothing happening in any of the logs, even at debug level (cephadm, logs of the active mgr, ceph -W cephadm --watch-debug, ...).
I also tried 16.2.1, since 16.2.0 didn't seem to work, but the result is the same.

Here's what I see from ceph -W cephadm --watch-debug:
2021-07-09T14:13:14.642077+0000 mgr.server36.xujjng [INF] Upgrade: Started with target docker.io/ceph/ceph:v16.2.1

and nothing after that.

In the mgr docker logs, I see (roughly) the same line, followed by unrelated debug output:
::ffff:127.0.0.1 - - [09/Jul/2021:15:08:33] "GET /metrics HTTP/1.1" 200 1423923 "" "Prometheus/2.18.1"
debug 2021-07-09T15:08:35.829+0000 7fa870818700  0 log_channel(cluster) log [DBG] : pgmap v3602: 489 pgs: 489 active+clean; 4.8 TiB data, 15 TiB used, 65 TiB / 81 TiB avail; 2.8 MiB/s rd, 37 MiB/s wr, 3.99k op/s
debug 2021-07-09T15:08:37.829+0000 7fa870818700  0 log_channel(cluster) log [DBG] : pgmap v3603: 489 pgs: 489 active+clean; 4.8 TiB data, 15 TiB used, 65 TiB / 81 TiB avail; 2.0 MiB/s rd, 30 MiB/s wr, 2.75k op/s
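
Since the mgr log is dominated by pgmap updates and Prometheus scrapes, a small filter can help isolate anything else. The following is only a sketch: the two noise patterns are taken from the kinds of lines quoted above and are assumptions that may need adjusting for other clusters, and `filter_noise` is a hypothetical helper name, not part of Ceph.

```python
import re

# Assumed noise patterns, based only on the log lines quoted in this post.
NOISE_PATTERNS = [
    re.compile(r'"GET /metrics HTTP/1\.1"'),    # Prometheus scrape access log
    re.compile(r"log \[DBG\] : pgmap v\d+:"),   # periodic pgmap summary
]


def filter_noise(lines):
    """Return only the lines that match none of the noise patterns."""
    return [
        line for line in lines
        if not any(pattern.search(line) for pattern in NOISE_PATTERNS)
    ]


# Sample input: lines copied from this post.
SAMPLE_LINES = [
    '::ffff:127.0.0.1 - - [09/Jul/2021:15:08:33] "GET /metrics HTTP/1.1" 200 1423923 "" "Prometheus/2.18.1"',
    "debug 2021-07-09T15:08:35.829+0000 7fa870818700  0 log_channel(cluster) log [DBG] : pgmap v3602: 489 pgs: 489 active+clean",
    "2021-07-09T14:13:14.642077+0000 mgr.server36.xujjng [INF] Upgrade: Started with target docker.io/ceph/ceph:v16.2.1",
]

if __name__ == "__main__":
    for line in filter_noise(SAMPLE_LINES):
        print(line)
```

The same function could be fed the saved output of the mgr container's logs, one line per list element.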

I don't see anything in the cephadm logs.

The upgrade status doesn't look good either:

{
    "target_image": "docker.io/ceph/ceph:v16.2.1",
    "in_progress": true,
    "services_complete": [],
    "progress": "",
    "message": ""
}
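
The "started but idle" pattern in this output can be checked programmatically. This is a minimal sketch, assuming the JSON above was captured from `ceph orch upgrade status` (the post does not name the exact command); `looks_stalled` is a hypothetical helper, and treating empty `progress`, `message` and `services_complete` as "no activity" is an assumption, not documented Ceph behaviour.

```python
import json


def looks_stalled(status_json: str) -> bool:
    """Return True if the upgrade is flagged in progress but reports no activity."""
    status = json.loads(status_json)
    in_progress = bool(status.get("in_progress"))
    # Any non-empty value in these fields is treated as a sign of activity.
    has_activity = any(
        status.get(key) for key in ("services_complete", "progress", "message")
    )
    return in_progress and not has_activity


# The status output quoted in this post.
SAMPLE = """
{
    "target_image": "docker.io/ceph/ceph:v16.2.1",
    "in_progress": true,
    "services_complete": [],
    "progress": "",
    "message": ""
}
"""

if __name__ == "__main__":
    print(looks_stalled(SAMPLE))
```

Run periodically against fresh status output, a check like this would show whether the upgrade ever leaves the idle state.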

Do you know where I could find logs or other information that would show why the upgrade doesn't start?

thanks!
Sylvain

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx