Hello,

We had a network hiccup with a Ceph cluster, and it caused several of our OSDs to go out/down. After the network was fixed, the OSDs remained down. We have restarted them in numerous ways and they won't come up; the sorts of commands we have tried are sketched at the bottom of this message. The logs for the down OSDs just repeat this line over and over: "tick checking mon for new map". There are OSDs on the same hosts that are up, so there is connectivity between the OSDs and the mons. Any advice on where to look for a resolution is appreciated.

Thanks,
Nathan

Cluster was built with cephadm
Ceph Quincy - 17.2.6
Docker version 23.0.2, build 569dd73
Ubuntu 20.04.6 LTS

ceph status:

  cluster:
    id:     aa39fa2a-1510-11ee-953a-bd804ec1ea33
    health: HEALTH_ERR
            Failed to apply 1 service(s): nfs.secstorage
            1 filesystem is degraded
            1 MDSs report slow metadata IOs
            Module 'cephadm' has failed: Command '['rados', '-n', 'mgr.cphprodc1-11.uuuhug', '-k', '/var/lib/ceph/mgr/ceph-cphprodc1-11.uuuhug/keyring', '-p', '.nfs', '--namespace', 'secstorage', 'rm', 'grace']' timed out after 10 seconds
            28 osds down
            Reduced data availability: 36 pgs stale
            2 daemons have recently crashed
            1 mgr modules have recently crashed
            945514 slow ops, oldest one blocked for 66804 sec, daemons [mon.cphprodc1-10,mon.cphprodc1-11,mon.cphprodc1-13] have slow ops.

  services:
    mon: 4 daemons, quorum cphprodc1-10,cphprodc1-11,cphprodc1-12,cphprodc1-13 (age 2h)
    mgr: cphprodc1-11.uuuhug(active, since 23h), standbys: cphprodc1-10.upwvbg
    mds: 1/1 daemons up, 1 standby
    osd: 64 osds: 19 up (since 2d), 47 in (since 23h)

  data:
    volumes: 0/1 healthy, 1 recovering
    pools:   5 pools, 113 pgs
    objects: 151.91k objects, 592 GiB
    usage:   840 GiB used, 81 TiB / 82 TiB avail
    pgs:     65 active+clean
             36 stale+active+clean
             7  active+clean+scrubbing
             5  active+clean+scrubbing+deep

osd tree:

ID  CLASS  WEIGHT     TYPE NAME              STATUS  REWEIGHT  PRI-AFF
-1         111.78223  root default
-5          27.94556      host cphprodc1-10
 1    ssd    1.74660          osd.1            down   1.00000  1.00000
 5    ssd    1.74660          osd.5            down   1.00000  1.00000
 8    ssd    1.74660          osd.8            down   1.00000  1.00000
12    ssd    1.74660          osd.12           down   1.00000  1.00000
14    ssd    1.74660          osd.14           down   1.00000  1.00000
18    ssd    1.74660          osd.18           down         0  1.00000
22    ssd    1.74660          osd.22           down         0  1.00000
26    ssd    1.74660          osd.26           down         0  1.00000
30    ssd    1.74660          osd.30           down         0  1.00000
34    ssd    1.74660          osd.34           down   1.00000  1.00000
37    ssd    1.74660          osd.37           down   1.00000  1.00000
41    ssd    1.74660          osd.41           down   1.00000  1.00000
45    ssd    1.74660          osd.45             up   1.00000  1.00000
48    ssd    1.74660          osd.48             up   1.00000  1.00000
52    ssd    1.74660          osd.52             up   1.00000  1.00000
56    ssd    1.74660          osd.56             up   1.00000  1.00000
-7          27.94556      host cphprodc1-11
 2    ssd    1.74660          osd.2            down         0  1.00000
 6    ssd    1.74660          osd.6            down   1.00000  1.00000
10    ssd    1.74660          osd.10           down   1.00000  1.00000
16    ssd    1.74660          osd.16           down   1.00000  1.00000
20    ssd    1.74660          osd.20           down         0  1.00000
24    ssd    1.74660          osd.24           down         0  1.00000
28    ssd    1.74660          osd.28           down         0  1.00000
32    ssd    1.74660          osd.32           down         0  1.00000
36    ssd    1.74660          osd.36           down   1.00000  1.00000
40    ssd    1.74660          osd.40           down   1.00000  1.00000
44    ssd    1.74660          osd.44           down   1.00000  1.00000
50    ssd    1.74660          osd.50             up   1.00000  1.00000
54    ssd    1.74660          osd.54             up   1.00000  1.00000
58    ssd    1.74660          osd.58             up   1.00000  1.00000
60    ssd    1.74660          osd.60             up   1.00000  1.00000
62    ssd    1.74660          osd.62             up   1.00000  1.00000
-3          27.94556      host cphprodc1-12
 0    ssd    1.74660          osd.0            down   1.00000  1.00000
 4    ssd    1.74660          osd.4            down   1.00000  1.00000
 7    ssd    1.74660          osd.7            down   1.00000  1.00000
11    ssd    1.74660          osd.11           down   1.00000  1.00000
15    ssd    1.74660          osd.15           down   1.00000  1.00000
19    ssd    1.74660          osd.19           down         0  1.00000
23    ssd    1.74660          osd.23           down         0  1.00000
27    ssd    1.74660          osd.27           down         0  1.00000
31    ssd    1.74660          osd.31           down         0  1.00000
35    ssd    1.74660          osd.35           down   1.00000  1.00000
38    ssd    1.74660          osd.38           down   1.00000  1.00000
42    ssd    1.74660          osd.42           down   1.00000  1.00000
46    ssd    1.74660          osd.46             up   1.00000  1.00000
49    ssd    1.74660          osd.49             up   1.00000  1.00000
53    ssd    1.74660          osd.53             up   1.00000  1.00000
57    ssd    1.74660          osd.57             up   1.00000  1.00000
-9          27.94556      host cphprodc1-13
 3    ssd    1.74660          osd.3            down   1.00000  1.00000
 9    ssd    1.74660          osd.9            down   1.00000  1.00000
13    ssd    1.74660          osd.13           down   1.00000  1.00000
17    ssd    1.74660          osd.17           down   1.00000  1.00000
21    ssd    1.74660          osd.21           down         0  1.00000
25    ssd    1.74660          osd.25           down         0  1.00000
29    ssd    1.74660          osd.29           down         0  1.00000
33    ssd    1.74660          osd.33           down         0  1.00000
39    ssd    1.74660          osd.39           down   1.00000  1.00000
43    ssd    1.74660          osd.43           down   1.00000  1.00000
47    ssd    1.74660          osd.47             up   1.00000  1.00000
51    ssd    1.74660          osd.51             up   1.00000  1.00000
55    ssd    1.74660          osd.55             up   1.00000  1.00000
59    ssd    1.74660          osd.59             up   1.00000  1.00000
61    ssd    1.74660          osd.61             up   1.00000  1.00000
63    ssd    1.74660          osd.63             up   1.00000  1.00000
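For reference, the restarts and log checks so far have been along the lines of the commands below; osd.1 just stands in for each of the down OSDs, and the systemd unit name uses the cluster fsid from the status output above.

  # restart through the cephadm orchestrator
  ceph orch daemon restart osd.1

  # restart the daemon's systemd unit directly on its host
  systemctl restart ceph-aa39fa2a-1510-11ee-953a-bd804ec1ea33@osd.1.service

  # check what the daemon is logging (this is where the repeating
  # "tick checking mon for new map" line shows up)
  cephadm logs --name osd.1
  journalctl -u ceph-aa39fa2a-1510-11ee-953a-bd804ec1ea33@osd.1.service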