Hi,

This cluster was deployed by cephadm 17.2.5, containerized. It ended up in this state (no active mgr):

[root@8cd2c0657c77 /]# ceph -s
  cluster:
    id:     ad3a132e-e9ee-11ed-8a19-043f72fb8bf9
    health: HEALTH_WARN
            6 hosts fail cephadm check
            no active mgr
            1/3 mons down, quorum h18w,h19w
            Degraded data redundancy: 781908/2345724 objects degraded (33.333%), 101 pgs degraded, 209 pgs undersized

  services:
    mon: 3 daemons, quorum h18w,h19w (age 19m), out of quorum: h15w
    mgr: no daemons active (since 5h)
    mds: 1/1 daemons up, 1 standby
    osd: 9 osds: 6 up (since 5h), 6 in (since 5h)
    rgw: 2 daemons active (2 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   8 pools, 209 pgs
    objects: 781.91k objects, 152 GiB
    usage:   312 GiB used, 54 TiB / 55 TiB avail
    pgs:     781908/2345724 objects degraded (33.333%)
             108 active+undersized
             101 active+undersized+degraded

I checked h20w; there is a mgr container running there, with this in its log:

debug 2023-05-10T12:43:23.315+0000 7f5e152ec000 0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T12:48:23.318+0000 7f5e152ec000 0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T12:53:23.318+0000 7f5e152ec000 0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T12:58:23.319+0000 7f5e152ec000 0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T13:03:23.319+0000 7f5e152ec000 0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T13:08:23.319+0000 7f5e152ec000 0 monclient(hunting): authenticate timed out after 300
debug 2023-05-10T13:13:23.319+0000 7f5e152ec000 0 monclient(hunting): authenticate timed out after 300

Any idea how to get a mgr up and running again through cephadm?

Thanks,
Ben
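P.S. In case it is useful, below is roughly what I was planning to poke at on h20w, based on my reading of the cephadm docs. The mgr daemon name (mgr.h20w.xxxxxx) is just a placeholder for whatever cephadm ls actually reports, and the fsid is the one from ceph -s above; please correct me if this is the wrong approach.

    # on h20w, as root -- mgr.h20w.xxxxxx is a placeholder, cephadm ls shows the real name
    cephadm ls | grep -i mgr                # confirm the mgr daemon name and its reported state
    cephadm logs --name mgr.h20w.xxxxxx     # shows the same monclient(hunting) messages as above

    # the mgr only talks to the mons listed in its local minimal config, so check those
    cat /var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/mgr.h20w.xxxxxx/config

    # can h20w actually reach the in-quorum mons (h18w/h19w) on the mon ports?
    nc -zv h18w 3300
    nc -zv h19w 3300

    # if config and connectivity look fine, restart the mgr unit via systemd
    systemctl restart ceph-ad3a132e-e9ee-11ed-8a19-043f72fb8bf9@mgr.h20w.xxxxxx.service

    # once a mgr is active again, the orchestrator should be usable for a proper redeploy:
    # ceph orch daemon redeploy mgr.h20w.xxxxxx

My understanding is that ceph orch commands need an active mgr, which is why I was looking at the local cephadm/systemd route first.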