Re: "ceph orch" not working anymore

Thanks Eugen & Redouane,

of course I tried enabling and disabling the cephadm module for the MGRs.

Running ceph mgr module enable cephadm produces this output in the MGR log:

-1 mgr load Failed to construct class in 'cephadm'
-1 mgr load Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 619, in __init__
    self.to_remove_osds.load_from_store()
  File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 922, in load_from_store
    for osd in json.loads(v):
  File "/lib64/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/lib64/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/lib64/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
-1 mgr operator() Failed to run module in active mode ('cephadm')

This comes from inside the MGR container, which runs Python 3.9. On the hosts it's Python 3.11.
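Judging by the traceback, load_from_store() hits a value in the MGR key/value store that is not valid JSON; an empty string would produce exactly "Expecting value ... (char 0)". A minimal sketch for inspecting and, if needed, resetting that entry; the key name mgr/cephadm/osd_remove_queue is an assumption inferred from cephadm/services/osd.py, and resetting it assumes no OSD removals are actually in flight:

  # Look for the stored OSD removal queue (key name inferred from
  # cephadm/services/osd.py, verify it on your version first):
  ceph config-key dump | grep osd_remove
  ceph config-key get mgr/cephadm/osd_remove_queue

  # If the value is empty or not valid JSON, reset it to an empty
  # JSON list so load_from_store() can parse it again:
  ceph config-key set mgr/cephadm/osd_remove_queue '[]'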

I am thinking of redeploying an MGR.

Can I stop the existing MGRs?

Redeploying with ceph orch does not work, of course, but I think this will work:

https://docs.ceph.com/en/latest/cephadm/troubleshooting/#manually-deploying-a-manager-daemon

because standalone cephadm is working, crazy as it sounds.
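The steps from that page, condensed into a sketch (daemon name, fsid and image are placeholders, and config-json.json has to bundle the minimal conf and keyring as the page describes):

  # Pause cephadm so it does not interfere with the manual deployment:
  ceph config-key set mgr/cephadm/pause true

  # Keyring and minimal conf for the new daemon:
  ceph auth get-or-create mgr.host1.new mon 'profile mgr' osd 'allow *' mds 'allow *'
  ceph config generate-minimal-conf

  # Deploy via the standalone cephadm, which still works for us:
  cephadm --image <container-image> deploy --fsid <fsid> \
      --name mgr.host1.new --config-json config-json.json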

What do you think?

Best,
Malte

On 17.10.24 12:49, Eugen Block wrote:
Hi,

if you just execute cephadm commands, those are issued locally on the hosts, so they won't confirm an orchestrator issue. What does the active MGR log? It could show a stack trace or error messages pointing to the root cause.
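One way to pull that log with the standalone cephadm, since that still works for you (a sketch; daemon name and line count are placeholders, extra arguments after -- are passed to journalctl):

  # Identify the active MGR, then read its daemon log:
  ceph mgr dump | grep active_name
  cephadm logs --name mgr.<host>.<suffix> -- -n 200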

What about the cephadm files under /var/lib/ceph/fsid? Can I replace the latest?

Those are the cephadm versions the orchestrator actually uses; it will just download them again from your registry (or upstream).
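You can see those per-version copies like this (a sketch; the exact layout may vary by release):

  # One extracted cephadm binary per deployed version:
  ls -l /var/lib/ceph/<fsid>/cephadm.*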
Can you share:

ceph -s
ceph versions
MGR logs (active MGR)

Thanks,
Eugen

Zitat von Malte Stroem <malte.stroem@xxxxxxxxx>:

Hello,

I am still struggling here and do not know the root cause of this issue.

Searching the list, I found lots of people who have had the same or a similar problem over the last few years.

However, there is no solution for our cluster.

Disabling and enabling the cephadm module does not help, and there are no error messages. When we run "ceph orch ..." we get the error message:

Error ENOENT: No orchestrator configured (try `ceph orch set backend`)

But every single cephadm command works!

cephadm ls, for example.
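For reference, the full sequence we tried looks like this (the backend has to be set again after re-enabling the module):

  # Re-register the orchestrator backend; disable/enable completes
  # without errors, yet "ceph orch" still returns ENOENT afterwards:
  ceph mgr module disable cephadm
  ceph mgr module enable cephadm
  ceph orch set backend cephadm
  ceph orch status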

Stopping and restarting the MGRs did not help. Removing the .asok files did not help.

I am thinking of stopping both MGRs and trying to deploy a new MGR like this:

https://docs.ceph.com/en/latest/cephadm/troubleshooting/#manually-deploying-a-manager-daemon

How can I find the root cause? Is cephadm somehow broken?
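One thing I can still try is raising cephadm's log level and watching the cluster log, as the troubleshooting guide suggests:

  # Raise cephadm log verbosity and follow the cluster log live:
  ceph config set mgr mgr/cephadm/log_to_cluster_level debug
  ceph -W cephadm --watch-debug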

What about the cephadm files under /var/lib/ceph/fsid? Can I replace the latest?

Best,
Malte

On 16.10.24 14:54, Malte Stroem wrote:
Hi Laimis,

that did not work; ceph orch still fails.

Best,
Malte

On 16.10.24 14:12, Malte Stroem wrote:
Thank you, Laimis.

And you got the same error message? That's strange.

In the meantime I am checking for connected clients. No Kubernetes and no CephFS, but RGWs.

Best,
Malte

On 16.10.24 14:01, Laimis Juzeliūnas wrote:
Hi Malte,

We faced this recently when upgrading from the latest Reef to Squid.
As a temporary workaround we disabled the balancer with ‘ceph balancer off’ and restarted the mgr daemons. We suspect older clients (Kubernetes RBD mounts as well as CephFS mounts) on servers with incompatible client versions, but have yet to dig through it.
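In shell form (a sketch; on recent releases ceph mgr fail without arguments fails over the active mgr):

  # Disable the balancer, then force a failover to the standby mgr:
  ceph balancer off
  ceph mgr fail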

Best,
Laimis J.

On 16 Oct 2024, at 14:57, Malte Stroem <malte.stroem@xxxxxxxxx> wrote:

Error ENOENT: No orchestrator configured (try `ceph orch set backend`)

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



