Sometimes ceph-volume commands hang when trying to access a device. Please
take a look at the solution/steps provided by Adam in the thread titled
"Issue adding host with cephadm - nothing is deployed" to check whether
cephadm is waiting for a ceph-volume command to complete.

Regards, Redo.

On Tue, Nov 29, 2022 at 8:55 AM Volker Racho <rgsw4000@xxxxxxxxx> wrote:

> Hi,
>
> ceph orch commands are not executed anymore in my cephadm-managed cluster
> (17.2.3) and I don't see why. Cluster is healthy and overall working,
> except for the orchestrator part.
>
> For instance, when I run `ceph orch redeploy ingress.rgw.default`, I see
> the command in audit logs, cephadm also logs the command and
> "_kick_serve_loop" and that's it. No more messages or errors (also not in
> logs with debug level: ceph config set mgr mgr/cephadm/log_to_cluster_level
> debug; ceph -W cephadm --watch-debug) but it never redeploys the service.
>
> Nov 21 07:54:45 ceph-0.yy.xxxx.net bash[1262]: debug
> 2022-11-21T07:54:45.397+0000 7f7b6b527700 0 log_channel(audit) log [DBG] :
> from='client.38766115 -' entity='client.admin' cmd=[{"prefix": "orch",
> "action": "redeploy", "service_nam
> Nov 21 07:54:45 ceph-0.yy.xxxx.net bash[1262]: debug
> 2022-11-21T07:54:45.401+0000 7f7b6bd28700 0 [cephadm INFO root] Redeploy
> service ingress.rgw.default
> Nov 21 07:54:45 ceph-0.yy.xxxx.net bash[1262]: debug
> 2022-11-21T07:54:45.401+0000 7f7b6bd28700 0 log_channel(cephadm) log [INF]
> : Redeploy service ingress.rgw.default
> Nov 21 07:54:45 ceph-0.yy.xxxx.net bash[1262]: debug
> 2022-11-21T07:54:45.401+0000 7f7b6bd28700 0 log_channel(cephadm) log [DBG]
> : _kick_serve_loop
> Nov 21 07:54:45 ceph-0.yy.xxxx.net bash[1262]: debug
> 2022-11-21T07:54:45.401+0000 7f7b6bd28700 0 log_channel(cephadm) log [DBG]
> : _kick_serve_loop
>
> Same behaviour for many other ceph orch ... command including ceph orch
> upgrade.
>
> # ceph orch status
> Backend: cephadm
> Available: Yes
> Paused: No
>
> According to status, orchestrator is available and not paused. I have tried
> to set the backend to "" and reset to "cephadm", I paused and resumed the
> orchestrator, cleared progress entries and such but nothing could make the
> cluster execute the commands. SSH connections between hosts are working.
>
> Any ideas how to fix or even debug this? I am a bit lost on this.
>
> Regards, SW.
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
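
One rough way to look for a hung ceph-volume process on a host, as suggested at the top of this reply, is to list processes whose command line mentions ceph-volume and that have been running for a long time. The sketch below is only an illustration, not the procedure from Adam's thread: the function names and the 600-second threshold are assumptions, and it relies on the Linux procps `ps -eo pid,etimes,args` output format.

```python
import subprocess


def parse_ps(output):
    """Parse `ps -eo pid,etimes,args` text into (pid, elapsed_seconds, command) tuples."""
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        # Skip the header line and anything that is not "<pid> <seconds> <command>".
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        rows.append((int(parts[0]), int(parts[1]), parts[2].rstrip()))
    return rows


def find_long_running(rows, needle="ceph-volume", min_age=600):
    """Return rows whose command contains `needle` and whose age is at least `min_age` seconds.

    The 600-second default is an arbitrary assumption; pick a value that
    fits how long ceph-volume normally takes on your hardware.
    """
    return [row for row in rows if needle in row[2] and row[1] >= min_age]


def report():
    """Run ps on the local host and print candidate hung ceph-volume processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,etimes,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    for pid, age, command in find_long_running(parse_ps(result.stdout)):
        print(f"pid={pid} running for {age}s: {command}")
```

Calling `report()` on each cluster host prints any matching processes. An empty result does not prove that nothing is stuck, and a match only shows that a process is old, not that it is blocked on a device.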