Hi,

Here it is:

# cephadm shell -- ceph status
Using recent ceph image 172.16.3.146:4000/ceph/ceph:v15.2.9
  cluster:
    id:     3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c
    health: HEALTH_WARN
            2 failed cephadm daemon(s)

  services:
    mon: 3 daemons, quorum spsrc-mon-1,spsrc-mon-2,spsrc-mon-3 (age 7d)
    mgr: spsrc-mon-1.eziiam(active, since 7d), standbys: spsrc-mon-2.ilbncj, spsrc-mon-3.vzwxfr
    mds: manila:1 {0=manila.spsrc-mon-2.syveaq=up:active} 2 up:standby
    osd: 248 osds: 248 up (since 2w), 248 in (since 3M)

  data:
    pools:   6 pools, 257 pgs
    objects: 4.77M objects, 5.9 TiB
    usage:   12 TiB used, 1.3 PiB / 1.3 PiB avail
    pgs:     257 active+clean

Also:

# cephadm shell -- ceph health detail
Using recent ceph image 172.16.3.146:4000/ceph/ceph:v15.2.9
HEALTH_WARN 2 failed cephadm daemon(s)
[WRN] CEPHADM_FAILED_DAEMON: 2 failed cephadm daemon(s)
    daemon mon.spsrc-mon-1-safe on spsrc-mon-1 is in error state
    daemon mon.spsrc-mon-2-safe on spsrc-mon-2 is in error state

I don't think these containers are crucial, right? I did ask about this a
while ago:
https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/MQM46KBC3BNACYZWW37CGUHMNLTZQUTF/

On all three Ceph monitor nodes, "systemctl status ceph\*.service" reports
the units as OK.

These are the commands I tried for inspecting the logs:

grep -i health -r /var/log/ceph/
grep -i error -r /var/log/ceph/

I get:

ceph_volume.exceptions.ConfigurationError: Unable to load expected Ceph config at: /etc/ceph/ceph.conf

But I think that is expected in a containerised deployment?

Can you suggest other commands?

Many thanks,
Sebastian

On Wed, 19 May 2021 at 21:49, Eugen Block <eblock@xxxxxx> wrote:

> Hi,
>
> can you paste the ceph status?
> The orchestrator is an MGR module, have you checked if the containers
> are up and running (assuming it’s cephadm based)? Do the logs also
> report the cluster as healthy?
>
> Quoting Sebastian Luna Valero <sebastian.luna.valero@xxxxxxxxx>:
>
> > Hi,
> >
> > After an unscheduled power outage our Ceph (Octopus) cluster reports a
> > healthy state with "ceph status". However, when we run "ceph orch status"
> > the command hangs forever.
> >
> > Are there other commands that we can run for a more thorough health
> > check of the cluster?
> >
> > After looking at
> > https://docs.ceph.com/en/octopus/rados/operations/health-checks/
> > I also ran "ceph crash ls-new", but it hangs forever as well.
> >
> > Any ideas?
> >
> > Our Ceph cluster is currently used as backend storage for our OpenStack
> > cluster, and we are also having issues with storage volumes attached to
> > VMs, but we don't know how to narrow down the root cause.
> >
> > Any feedback is highly appreciated.
> >
> > Best regards,
> > Sebastian
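
For reference, a minimal sketch of how the two failed "-safe" mon daemons
could be inspected directly on their hosts. The daemon and host names are
taken from the "ceph health detail" output above; the systemd unit names
are an assumption based on the usual cephadm "ceph-<fsid>@<daemon>" pattern
and should be double-checked against "cephadm ls" on the host.

    # run as root on spsrc-mon-1 (and likewise on spsrc-mon-2)

    # list all cephadm-managed daemons on this host and their current state
    cephadm ls

    # show the logs of the failed daemon via cephadm (wraps journalctl)
    cephadm logs --name mon.spsrc-mon-1-safe

    # or query systemd directly; the unit name is assumed to follow the
    # standard cephadm pattern, with the fsid taken from "ceph status" above
    systemctl status 'ceph-3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c@mon.spsrc-mon-1-safe.service'
    journalctl -u 'ceph-3cdbf59a-a74b-11ea-93cc-f0d4e2e6643c@mon.spsrc-mon-1-safe.service' -n 200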
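
"ceph orch status" and "ceph crash ls-new" are both served by the active
MGR, so if they hang the MGR side is worth checking as well. A small sketch,
assuming the active MGR instance name shown in the "ceph status" output
above; failing over to a standby MGR is only included as an illustrative
last step.

    # run inside "cephadm shell", as in the outputs above

    # check which MGR modules are enabled and which MGR instance is active
    ceph mgr module ls
    ceph mgr dump

    # cephadm records its activity in the "cephadm" cluster log channel
    ceph log last cephadm

    # if orchestrator calls still hang, fail over to a standby MGR
    # (active instance name taken from "ceph status" above) and then
    # retry "ceph orch status"
    ceph mgr fail spsrc-mon-1.eziiam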