Hi all,

I have a Ceph 17.2.5 cluster deployed via cephadm. After a few reboots it has entered a fairly broken state, as shown below. I am having trouble even beginning to diagnose this because many commands simply hang: "cephadm ps" and "ceph orch ls", for example, hang forever. Other commands, such as "ceph pg 7.4e query", return JSON errors. As it stands, the CephFS filesystem is inaccessible, as is my RBD mount on Windows Server 2019. Even though the cluster only reports HEALTH_WARN, it seems to be in a pretty terminal state right now ☹ I wonder if any of you wonderful people could help point me in the right direction?

root@c-dc01-ceph01:~# ceph pg 7.4e query
Couldn't parse JSON : Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "/usr/bin/ceph", line 1326, in <module>
    retval = main()
  File "/usr/bin/ceph", line 1246, in main
    sigdict = parse_json_funcsigs(outbuf.decode('utf-8'), 'cli')
  File "/usr/lib/python3/dist-packages/ceph_argparse.py", line 993, in parse_json_funcsigs
    raise e
  File "/usr/lib/python3/dist-packages/ceph_argparse.py", line 990, in parse_json_funcsigs
    overall = json.loads(s)
  File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

root@c-dc01-ceph01:~# ceph status
  cluster:
    id:     2a6ec9f2-56c4-11ed-a428-bdec5d6d07e0
    health: HEALTH_WARN
            3 failed cephadm daemon(s)
            1 filesystem is degraded
            1 MDSs report slow metadata IOs
            Reduced data availability: 6686 pgs inactive, 5982 pgs peering

  services:
    mon:        3 daemons, quorum c-dc02-ceph01,c-dc03-ceph01,c-dc01-ceph01 (age 2h)
    mgr:        c-dc02-ceph01.touart(active, since 39h), standbys: c-dc01-ceph01.owmpxa
    mds:        1/1 daemons up, 2 standby
    osd:        144 osds: 144 up (since 39h), 144 in (since 4w); 2607 remapped pgs
    rbd-mirror: 2 daemons active (2 hosts)

  data:
    volumes: 0/1 healthy, 1 recovering
    pools:   15 pools, 9293 pgs
    objects: 597.03k objects, 2.1 TiB
    usage:   3.8 TiB used, 248 TiB / 252 TiB avail
    pgs:     7.576% pgs unknown
             64.371% pgs not active
             691820/1791087 objects misplaced (38.626%)
             5982 peering
             2607 active+clean+remapped
             704  unknown

  io:
    client:   850 B/s rd, 0 op/s rd, 0 op/s wr

  progress:
    Global Recovery Event (15h)
      [=======.....................] (remaining: 4d)

root@c-dc01-ceph01:~# ceph version
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)

Thanks,
Neil.
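P.S. In case it's useful, this is roughly what I'm planning to run next to gather more detail. Please treat it as a sketch rather than anything I've verified: the 30-second timeout is an arbitrary value so the hanging commands don't block my shell, and the mgr daemon name is copied from the "ceph status" output above.

# Rough plan only - I'm not sure which of these depend on the (possibly stuck)
# mgr/orchestrator, so everything is wrapped in an arbitrary 30s timeout.
timeout 30 ceph health detail
timeout 30 ceph osd tree
timeout 30 ceph fs status
timeout 30 ceph pg dump_stuck inactive
timeout 30 ceph orch ps
timeout 30 ceph orch ls

# Per-host daemon list straight from cephadm (no mgr involved), plus recent
# journal output for the active mgr - the daemon name is taken from the
# "ceph status" output above, and this last one would need running on
# c-dc02-ceph01 where that mgr lives.
cephadm ls
cephadm logs --name mgr.c-dc02-ceph01.touart -- -n 200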