Hi all,

a lot of OSDs crashed in our cluster. Mimic 13.2.8. Current status included below. All daemons are running, no OSD process crashed. Can I start marking OSDs in and up to get them back talking to each other?

Please advise on next steps. Thanks!!

[root@gnosis ~]# ceph status
  cluster:
    id:     e4ece518-f2cb-4708-b00f-b6bf511e91d9
    health: HEALTH_WARN
            2 MDSs report slow metadata IOs
            1 MDSs report slow requests
            nodown,noout,norecover flag(s) set
            125 osds down
            3 hosts (48 osds) down
            Reduced data availability: 2221 pgs inactive, 1943 pgs down, 190 pgs peering, 13 pgs stale
            Degraded data redundancy: 5134396/500993581 objects degraded (1.025%), 296 pgs degraded, 299 pgs undersized
            9622 slow ops, oldest one blocked for 2913 sec, daemons [osd.0,osd.100,osd.101,osd.112,osd.118,osd.133,osd.136,osd.142,osd.144,osd.145]... have slow ops.

  services:
    mon: 3 daemons, quorum ceph-01,ceph-02,ceph-03
    mgr: ceph-02(active), standbys: ceph-03, ceph-01
    mds: con-fs2-1/1/1 up {0=ceph-08=up:active}, 1 up:standby-replay
    osd: 288 osds: 90 up, 215 in; 230 remapped pgs
         flags nodown,noout,norecover

  data:
    pools:   10 pools, 2545 pgs
    objects: 62.61 M objects, 144 TiB
    usage:   219 TiB used, 1.6 PiB / 1.8 PiB avail
    pgs:     1.729% pgs unknown
             85.540% pgs not active
             5134396/500993581 objects degraded (1.025%)
             1796 down
             226  active+undersized+degraded
             147  down+remapped
             140  peering
             65   active+clean
             44   unknown
             38   undersized+degraded+peered
             38   remapped+peering
             17   active+undersized+degraded+remapped+backfill_wait
             12   stale+peering
             12   active+undersized+degraded+remapped+backfilling
             4    active+undersized+remapped
             2    remapped
             2    undersized+degraded+remapped+peered
             1    stale
             1    undersized+degraded+remapped+backfilling+peered

  io:
    client:   26 KiB/s rd, 206 KiB/s wr, 21 op/s rd, 50 op/s wr

[root@gnosis ~]# ceph health detail
HEALTH_WARN 2 MDSs report slow metadata IOs; 1 MDSs report slow requests; nodown,noout,norecover flag(s) set; 125 osds down; 3 hosts (48 osds) down; Reduced data availability: 2219 pgs inactive, 1943 pgs down,
188 pgs peering, 13 pgs stale; Degraded data redundancy: 5214696/500993589 objects degraded (1.041%), 298 pgs degraded, 299 pgs undersized; 9788 slow ops, oldest one blocked for 2953 sec, daemons [osd.0,osd.100,osd.101,osd.112,osd.118,osd.133,osd.136,osd.142,osd.144,osd.145]... have slow ops.
MDS_SLOW_METADATA_IO 2 MDSs report slow metadata IOs
    mdsceph-08(mds.0): 100+ slow metadata IOs are blocked > 30 secs, oldest blocked for 2940 secs
    mdsceph-12(mds.0): 1 slow metadata IOs are blocked > 30 secs, oldest blocked for 2942 secs
MDS_SLOW_REQUEST 1 MDSs report slow requests
    mdsceph-08(mds.0): 100 slow requests are blocked > 30 secs
OSDMAP_FLAGS nodown,noout,norecover flag(s) set
OSD_DOWN 125 osds down
    osd.0 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-21) is down
    osd.6 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.7 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.8 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-11) is down
    osd.16 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-08) is down
    osd.18 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.19 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-11) is down
    osd.21 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.31 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-18) is down
    osd.37 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-04) is down
    osd.38 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-07) is down
    osd.48 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-04) is down
    osd.51 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-22) is down
    osd.53 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-21) is down
    osd.55 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-19) is down
    osd.62 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-17) is down
    osd.67 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-11) is down
    osd.72 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-21) is down
    osd.75 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-08) is down
    osd.78 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.79 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-11) is down
    osd.80 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.81 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.82 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-14) is down
    osd.83 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-15) is down
    osd.88 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-08) is down
    osd.89 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.92 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.93 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.95 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-15) is down
    osd.96 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-16) is down
    osd.97 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-17) is down
    osd.100 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.104 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.105 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.107 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-15) is down
    osd.108 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-17) is down
    osd.109 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-16) is down
    osd.111 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-14) is down
    osd.113 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.114 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-09) is down
    osd.116 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.117 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.119 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-15) is down
    osd.122 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.123 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-15) is down
    osd.124 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-08) is down
    osd.125 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-09) is down
    osd.126 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.128 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.131 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-15) is down
    osd.134 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.139 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.140 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.141 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.145 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-04) is down
    osd.149 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.151 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-09) is down
    osd.152 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.153 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.154 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-14) is down
    osd.155 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-15) is down
    osd.156 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-04) is down
    osd.157 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-05) is down
    osd.159 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-07) is down
    osd.161 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-09) is down
    osd.162 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.164 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.165 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.166 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-15) is down
    osd.167 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-14) is down
    osd.171 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-08) is down
    osd.172 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-07) is down
    osd.174 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.176 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.177 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-13) is down
    osd.179 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-15) is down
    osd.182 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-06) is down
    osd.183 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-07) is down
    osd.184 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-08) is down
    osd.186 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.187 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-11) is down
    osd.190 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-14) is down
    osd.191 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-15) is down
    osd.194 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-16) is down
    osd.195 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-17) is down
    osd.196 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-16) is down
    osd.199 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-17) is down
    osd.200 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-16) is down
    osd.201 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-17) is down
    osd.202 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-16) is down
    osd.203 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-17) is down
    osd.204 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-16) is down
    osd.208 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-08) is down
    osd.210 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-08) is down
    osd.212 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.213 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-11) is down
    osd.214 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-09) is down
    osd.215 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-10) is down
    osd.216 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-11) is down
    osd.218 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-09) is down
    osd.219 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-11) is down
    osd.221 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-12) is down
    osd.224 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-16) is down
    osd.226 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1,host=ceph-17) is down
    osd.228 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-20) is down
    osd.230 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-20) is down
    osd.233 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-19) is down
    osd.236 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-19) is down
    osd.238 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-18) is down
    osd.247 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-21) is down
    osd.248 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-18) is down
    osd.254 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-04) is down
    osd.256 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-04) is down
    osd.259 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-18) is down
    osd.260 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-20) is down
    osd.262 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-19) is down
    osd.266 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-18) is down
    osd.267 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-18) is down
    osd.272 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-20) is down
    osd.274 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-21) is down
    osd.275 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-19) is down
    osd.276 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-22) is down
    osd.281 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-22) is down
    osd.285 (root=DTU,region=Risoe,datacenter=ServerRoom,room=SR-113,host=ceph-05) is down
OSD_HOST_DOWN 3 hosts (48 osds) down
    host ceph-11 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1) (16 osds) is down
    host ceph-10 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1) (16 osds) is down
    host ceph-13 (root=DTU,region=Risoe,datacenter=ContainerSquare,room=CON-161A1) (16 osds) is down
PG_AVAILABILITY Reduced data availability: 2219 pgs inactive, 1943 pgs down, 188 pgs peering, 13 pgs stale
    pg 14.513 is stuck inactive for 1681.564244, current state down, last acting [2147483647,2147483647,2147483647,2147483647,2147483647,143,2147483647,2147483647,2147483647,2147483647]
    pg 14.514 is down, acting [193,2147483647,2147483647,2147483647,2147483647,118,2147483647,2147483647,2147483647,2147483647]
    pg 14.515 is down, acting [2147483647,2147483647,2147483647,211,133,135,2147483647,2147483647,2147483647,2147483647]
    pg 14.516 is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,205,2147483647]
    pg 14.517 is down, acting [2147483647,2147483647,5,2147483647,2147483647,2147483647,2147483647,2147483647,61,112]
    pg 14.518 is down, acting [2147483647,198,2147483647,2147483647,2147483647,2147483647,4,185,2147483647,2147483647]
    pg 14.519 is down, acting [2147483647,2147483647,68,2147483647,2147483647,2147483647,2147483647,185,2147483647,94]
    pg 14.51a is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,101,2147483647]
    pg 14.51b is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,197,2147483647,2147483647,2147483647,2147483647]
    pg 14.51c is down, acting [193,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,197]
    pg 14.51d is down, acting [2147483647,2147483647,61,2147483647,77,2147483647,2147483647,2147483647,112,2147483647]
    pg 14.51e is down, acting [2147483647,2147483647,2147483647,2147483647,112,2147483647,2147483647,193,2147483647,2147483647]
    pg 14.51f is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,94,2147483647,2147483647]
    pg 14.520 is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,207,2147483647,101,133,2147483647]
    pg 14.521 is down, acting [205,2147483647,133,2147483647,2147483647,2147483647,2147483647,4,2147483647,193]
    pg 14.522 is down, acting [101,2147483647,2147483647,11,197,2147483647,136,94,2147483647,2147483647]
    pg 14.523 is down, acting [2147483647,2147483647,2147483647,118,2147483647,71,2147483647,2147483647,2147483647,2147483647]
    pg 14.524 is down, acting [2147483647,111,2147483647,2147483647,2147483647,8,2147483647,112,2147483647,2147483647]
    pg 14.525 is down, acting [2147483647,2147483647,2147483647,142,2147483647,61,2147483647,2147483647,2147483647,2147483647]
    pg 14.526 is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,61,193,2147483647,2147483647,2147483647]
    pg 14.527 is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,109,2147483647,2147483647]
    pg 14.528 is down, acting [2147483647,133,2147483647,2147483647,2147483647,2147483647,4,2147483647,2147483647,2147483647]
    pg 14.529 is down, acting [2147483647,112,2147483647,2147483647,2147483647,2147483647,185,2147483647,118,2147483647]
    pg 14.52a is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,136,2147483647,135,2147483647,2147483647]
    pg 14.52b is down, acting [2147483647,2147483647,2147483647,112,142,211,2147483647,2147483647,2147483647,2147483647]
    pg 14.52c is down, acting [185,2147483647,198,2147483647,118,2147483647,2147483647,2147483647,2147483647,2147483647]
    pg 14.52d is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,5,2147483647,2147483647,2147483647]
    pg 14.52e is down, acting [71,101,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,142,2147483647]
    pg 14.52f is down, acting [198,2147483647,2147483647,2147483647,2147483647,11,2147483647,2147483647,118,2147483647]
    pg 14.530 is down, acting [142,2147483647,2147483647,2147483647,133,2147483647,2147483647,2147483647,2147483647,112]
    pg 14.531 is down, acting [2147483647,142,2147483647,2147483647,2147483647,185,2147483647,2147483647,2147483647,2147483647]
    pg 14.532 is down, acting [135,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,136,118]
    pg 14.533 is down, acting [2147483647,77,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647]
    pg 14.534 is down, acting [2147483647,2147483647,2147483647,185,118,2147483647,2147483647,207,2147483647,2147483647]
    pg 14.535 is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,136,142,133,2147483647]
    pg 14.536 is down, acting [2147483647,11,2147483647,2147483647,136,2147483647,2147483647,2147483647,2147483647,2147483647]
    pg 14.537 is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,77,2147483647]
    pg 14.538 is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,205,2147483647,2147483647]
    pg 14.539 is down, acting [2147483647,2147483647,2147483647,198,2147483647,2147483647,4,2147483647,2147483647,2147483647]
    pg 14.53a is down, acting [2147483647,11,136,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647]
    pg 14.53b is down, acting [2147483647,2147483647,2147483647,2147483647,112,2147483647,2147483647,2147483647,2147483647,2147483647]
    pg 14.53c is down, acting [2147483647,2147483647,2147483647,71,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647]
    pg 14.53d is down, acting [2147483647,2147483647,2147483647,185,2147483647,2147483647,2147483647,2147483647,2147483647,136]
    pg 14.53e is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,112,185]
    pg 14.53f is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,185,2147483647,2147483647,2147483647]
    pg 14.540 is down, acting [205,2147483647,2147483647,2147483647,2147483647,2147483647,142,2147483647,112,77]
    pg 14.541 is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,197,211,2147483647,2147483647,2147483647]
    pg 14.542 is down, acting [112,2147483647,101,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647]
    pg 14.543 is down, acting [111,2147483647,2147483647,2147483647,2147483647,101,2147483647,2147483647,2147483647,2147483647]
    pg 14.544 is down, acting [4,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,2147483647,205]
    pg 14.545 is down, acting [2147483647,2147483647,2147483647,2147483647,2147483647,142,5,2147483647,2147483647,2147483647]
PG_DEGRADED Degraded data redundancy: 5214696/500993589 objects degraded (1.041%), 298 pgs degraded, 299 pgs undersized
    pg 1.29 is stuck undersized for 2075.633328, current state active+undersized+degraded, last acting [253,258]
    pg 1.2a is stuck undersized for 1642.864920, current state active+undersized+degraded, last acting [252,255]
    pg 1.2b is stuck undersized for 2355.149928, current state active+undersized+degraded+remapped+backfill_wait, last acting [240,268]
    pg 1.2c is stuck undersized for 1459.277329, current state active+undersized+degraded, last acting [241,273]
    pg 1.2d is stuck undersized for 803.339131, current state undersized+degraded+peered, last acting [282]
    pg 2.25 is active+undersized+degraded, acting [253,2147483647,2147483647,258,261,273,277,243]
    pg 2.28 is stuck undersized for 803.340163, current state active+undersized+degraded, last acting [282,241,246,2147483647,273,252,2147483647,268]
    pg 2.29 is stuck undersized for 803.341160, current state active+undersized+degraded, last acting [240,258,277,264,2147483647,2147483647,271,250]
    pg 2.2a is stuck undersized for 1447.684978, current state active+undersized+degraded+remapped+backfilling, last acting [252,270,2147483647,261,2147483647,255,287,264]
    pg 2.2e is stuck undersized for 2030.849944, current state active+undersized+degraded, last acting [264,2147483647,251,245,257,286,261,258]
    pg 2.51 is stuck undersized for 1459.274671, current state active+undersized+degraded+remapped+backfilling, last acting [270,2147483647,2147483647,265,241,243,240,252]
    pg 2.52 is stuck undersized for 2030.850897, current state active+undersized+degraded+remapped+backfilling, last acting [240,2147483647,270,265,269,280,278,2147483647]
    pg 2.53 is stuck undersized for 1459.273517, current state active+undersized+degraded, last acting [261,2147483647,280,282,2147483647,245,243,241]
    pg 2.61 is stuck undersized for 2075.633140, current state active+undersized+degraded+remapped+backfilling, last acting [269,2147483647,258,286,270,255,2147483647,264]
    pg 2.62 is stuck undersized for 803.340577, current state active+undersized+degraded, last acting [2147483647,253,258,2147483647,250,287,264,284]
    pg 2.66 is stuck undersized for 803.341231, current state active+undersized+degraded, last acting [264,280,265,255,257,269,2147483647,270]
    pg 2.6c is stuck undersized for 963.369539, current state active+undersized+degraded, last acting [286,269,278,251,2147483647,273,2147483647,280]
    pg 2.70 is stuck undersized for 873.662725, current state active+undersized+degraded, last acting [2147483647,268,255,273,253,265,278,2147483647]
    pg 2.74 is stuck undersized for 2075.632312, current state active+undersized+degraded+remapped+backfilling, last acting [240,242,2147483647,245,243,269,2147483647,265]
    pg 3.24 is stuck undersized for 1570.800184, current state active+undersized+degraded, last acting [235,263]
    pg 3.25 is stuck undersized for 733.673503, current state undersized+degraded+peered, last acting [232]
    pg 3.28 is stuck undersized for 2610.307886, current state active+undersized+degraded, last acting [263,84]
    pg 3.2a is stuck undersized for 1214.710839, current state active+undersized+degraded, last acting [181,232]
    pg 3.2b is stuck undersized for 2075.630671, current state active+undersized+degraded, last acting [63,144]
    pg 3.52 is stuck undersized for 1570.777598, current state active+undersized+degraded, last acting [158,237]
    pg 3.54 is stuck undersized for 1350.257189, current state active+undersized+degraded, last acting [239,74]
    pg 3.55 is stuck undersized for 2592.642531, current state active+undersized+degraded, last acting [157,233]
    pg 3.5a is stuck undersized for 2075.608257, current state undersized+degraded+peered, last acting [168]
    pg 3.5c is stuck undersized for 733.674836, current state active+undersized+degraded, last acting [263,234]
    pg 3.5d is stuck undersized for 2610.307220, current state active+undersized+degraded, last acting [180,84]
    pg 3.5e is stuck undersized for 1710.756037, current state undersized+degraded+peered, last acting [146]
    pg 3.61 is stuck undersized for 1080.210021, current state active+undersized+degraded, last acting [168,239]
    pg 3.62 is stuck undersized for 831.217622, current state active+undersized+degraded, last acting [84,263]
    pg 3.63 is stuck undersized for 733.674204, current state active+undersized+degraded, last acting [263,232]
    pg 3.65 is stuck undersized for 1570.790824, current state active+undersized+degraded, last acting [63,84]
    pg 3.66 is stuck undersized for 733.682973, current state undersized+degraded+peered, last acting [63]
    pg 3.68 is stuck undersized for 1570.624462, current state active+undersized+degraded, last acting [229,148]
    pg 3.69 is stuck undersized for 1350.316213, current state undersized+degraded+peered, last acting [235]
    pg 3.6b is stuck undersized for 783.813654, current state undersized+degraded+peered, last acting [63]
    pg 3.6c is stuck undersized for 783.819083, current state undersized+degraded+peered, last acting [229]
    pg 3.6f is stuck undersized for 2610.321349, current state active+undersized+degraded, last acting [232,158]
    pg 3.72 is stuck undersized for 1350.358149, current state active+undersized+degraded, last acting [229,74]
    pg 3.73 is stuck undersized for 1570.788310, current state undersized+degraded+peered, last acting [234]
    pg 11.20 is stuck undersized for 733.682510, current state active+undersized+degraded, last acting [2147483647,239,87,2147483647,158,237,63,76]
    pg 11.26 is stuck undersized for 1914.334332, current state active+undersized+degraded, last acting [2147483647,237,2147483647,263,158,148,181,180]
    pg 11.2d is stuck undersized for 1350.365988, current state active+undersized+degraded, last acting [2147483647,2147483647,73,229,86,158,169,84]
    pg 11.54 is stuck undersized for 1914.398125, current state active+undersized+degraded, last acting [231,169,2147483647,229,84,85,237,63]
    pg 11.5b is stuck undersized for 2047.980719, current state active+undersized+degraded, last acting [86,237,168,263,144,1,229,2147483647]
    pg 11.5e is stuck undersized for 873.643661, current state active+undersized+degraded, last acting [181,2147483647,229,158,231,1,169,2147483647]
    pg 11.62 is stuck undersized for 1144.491696, current state active+undersized+degraded, last acting [2147483647,85,235,74,63,234,181,2147483647]
    pg 11.6f is stuck undersized for 873.646628, current state active+undersized+degraded, last acting [234,3,2147483647,158,180,63,2147483647,181]
SLOW_OPS 9788 slow ops, oldest one blocked for 2953 sec, daemons
[osd.0,osd.100,osd.101,osd.112,osd.118,osd.133,osd.136,osd.142,osd.144,osd.145]... have slow ops.

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
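
P.S. In case it helps with triage: the OSD_DOWN entries above can be tallied per host to see where the down OSDs are concentrated. This is only a minimal sketch, assuming the output of `ceph health detail` has been saved as plain text; the function name `down_osds_per_host` and the short sample string are hypothetical and not part of any Ceph tooling.

```python
import re
from collections import Counter

# Matches entries such as:
#   osd.0 (root=DTU,region=Risoe,...,host=ceph-21) is down
# \s+ tolerates line wraps between the OSD id and its CRUSH location.
OSD_DOWN_RE = re.compile(r"osd\.(\d+)\s+\(([^)]*)\)\s+is down")


def down_osds_per_host(health_detail: str) -> Counter:
    """Count OSDs reported down per host in `ceph health detail` text."""
    counts = Counter()
    for match in OSD_DOWN_RE.finditer(health_detail):
        # Turn "root=DTU,region=Risoe,host=ceph-21" into a dict.
        location = dict(
            item.split("=", 1)
            for item in match.group(2).split(",")
            if "=" in item
        )
        counts[location.get("host", "unknown")] += 1
    return counts


if __name__ == "__main__":
    # Shortened, hypothetical sample in the same shape as the output above.
    sample = (
        "osd.0 (root=DTU,host=ceph-21) is down "
        "osd.6 (root=DTU,host=ceph-12) is down "
        "osd.53 (root=DTU,host=ceph-21) is down "
        "host ceph-11 (root=DTU) (16 osds) is down"
    )
    for host, count in down_osds_per_host(sample).most_common():
        print(f"{host}: {count} down")
```

The pattern only matches `osd.N (...) is down` entries, so the `host ... (16 osds) is down` summary lines under OSD_HOST_DOWN are not counted twice.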