Good morning everyone! Today we had an atypical situation in our cluster: all three machines shut down. After powering them back on, the cluster came up and the monitors formed quorum without problems, but the PGs are all sitting in "Working" (peering), I see no disk activity on the machines, and no PG is active.

[ceph: root@dcs1 /]# ceph osd tree
ID  CLASS  WEIGHT    TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         98.24359  root default
-3         32.74786      host dcs1
 0    hdd   2.72899          osd.0       up   1.00000  1.00000
 1    hdd   2.72899          osd.1       up   1.00000  1.00000
 2    hdd   2.72899          osd.2       up   1.00000  1.00000
 3    hdd   2.72899          osd.3       up   1.00000  1.00000
 4    hdd   2.72899          osd.4       up   1.00000  1.00000
 5    hdd   2.72899          osd.5       up   1.00000  1.00000
 6    hdd   2.72899          osd.6       up   1.00000  1.00000
 7    hdd   2.72899          osd.7       up   1.00000  1.00000
 8    hdd   2.72899          osd.8       up   1.00000  1.00000
 9    hdd   2.72899          osd.9       up   1.00000  1.00000
10    hdd   2.72899          osd.10      up   1.00000  1.00000
11    hdd   2.72899          osd.11      up   1.00000  1.00000
-5         32.74786      host dcs2
12    hdd   2.72899          osd.12      up   1.00000  1.00000
13    hdd   2.72899          osd.13      up   1.00000  1.00000
14    hdd   2.72899          osd.14      up   1.00000  1.00000
15    hdd   2.72899          osd.15      up   1.00000  1.00000
16    hdd   2.72899          osd.16      up   1.00000  1.00000
17    hdd   2.72899          osd.17      up   1.00000  1.00000
18    hdd   2.72899          osd.18      up   1.00000  1.00000
19    hdd   2.72899          osd.19      up   1.00000  1.00000
20    hdd   2.72899          osd.20      up   1.00000  1.00000
21    hdd   2.72899          osd.21      up   1.00000  1.00000
22    hdd   2.72899          osd.22      up   1.00000  1.00000
23    hdd   2.72899          osd.23      up   1.00000  1.00000
-7         32.74786      host dcs3
24    hdd   2.72899          osd.24      up   1.00000  1.00000
25    hdd   2.72899          osd.25      up   1.00000  1.00000
26    hdd   2.72899          osd.26      up   1.00000  1.00000
27    hdd   2.72899          osd.27      up   1.00000  1.00000
28    hdd   2.72899          osd.28      up   1.00000  1.00000
29    hdd   2.72899          osd.29      up   1.00000  1.00000
30    hdd   2.72899          osd.30      up   1.00000  1.00000
31    hdd   2.72899          osd.31      up   1.00000  1.00000
32    hdd   2.72899          osd.32      up   1.00000  1.00000
33    hdd   2.72899          osd.33      up   1.00000  1.00000
34    hdd   2.72899          osd.34      up   1.00000  1.00000
35    hdd   2.72899          osd.35      up   1.00000  1.00000

[ceph: root@dcs1 /]# ceph -s
  cluster:
    id:     58bbb950-538b-11ed-b237-2c59e53b80cc
    health: HEALTH_WARN
            4 filesystems are degraded
            4 MDSs report slow metadata IOs
            Reduced data availability: 1153 pgs inactive, 1101 pgs peering
            26 slow ops, oldest one blocked for 563 sec, daemons [osd.10,osd.13,osd.14,osd.15,osd.16,osd.18,osd.20,osd.21,osd.24,osd.25]... have slow ops.

  services:
    mon: 3 daemons, quorum dcs1.evocorp,dcs2,dcs3 (age 7m)
    mgr: dcs1.evocorp.kyqfcd(active, since 15m), standbys: dcs2.rirtyl
    mds: 4/4 daemons up, 4 standby
    osd: 36 osds: 36 up (since 6m), 36 in (since 47m); 65 remapped pgs

  data:
    volumes: 0/4 healthy, 4 recovering
    pools:   10 pools, 1153 pgs
    objects: 254.72k objects, 994 GiB
    usage:   2.8 TiB used, 95 TiB / 98 TiB avail
    pgs:     100.000% pgs not active
             1036 peering
             65   remapped+peering
             52   activating
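In case it is useful, I could also pull a condensed list of the stuck PGs and a summary of which OSDs are blocking their peers (I have not run these yet, just noting them as possible next steps):

[ceph: root@dcs1 /]# ceph pg dump_stuck inactive
[ceph: root@dcs1 /]# ceph osd blocked-by

The full ceph health detail output follows.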
[ceph: root@dcs1 /]# ceph health detail
HEALTH_WARN 4 filesystems are degraded; 4 MDSs report slow metadata IOs; Reduced data availability: 1153 pgs inactive, 1101 pgs peering; 26 slow ops, oldest one blocked for 673 sec, daemons [osd.10,osd.13,osd.14,osd.15,osd.16,osd.18,osd.20,osd.21,osd.24,osd.25]... have slow ops.
[WRN] FS_DEGRADED: 4 filesystems are degraded
    fs dc_ovirt is degraded
    fs dc_iso is degraded
    fs dc_sas is degraded
    fs pool_tester is degraded
[WRN] MDS_SLOW_METADATA_IO: 4 MDSs report slow metadata IOs
    mds.dc_sas.dcs1.wbyuik(mds.0): 4 slow metadata IOs are blocked > 30 secs, oldest blocked for 1063 secs
    mds.dc_ovirt.dcs1.lpcazs(mds.0): 4 slow metadata IOs are blocked > 30 secs, oldest blocked for 1058 secs
    mds.pool_tester.dcs1.ixkkfs(mds.0): 4 slow metadata IOs are blocked > 30 secs, oldest blocked for 1058 secs
    mds.dc_iso.dcs1.jxqqjd(mds.0): 4 slow metadata IOs are blocked > 30 secs, oldest blocked for 1058 secs
[WRN] PG_AVAILABILITY: Reduced data availability: 1153 pgs inactive, 1101 pgs peering
    pg 6.c3 is stuck inactive for 50m, current state peering, last acting [30,15,11]
    pg 6.c4 is stuck peering for 10h, current state peering, last acting [12,0,26]
    pg 6.c5 is stuck peering for 10h, current state peering, last acting [12,32,6]
    pg 6.c6 is stuck peering for 11h, current state peering, last acting [30,4,22]
    pg 6.c7 is stuck peering for 10h, current state peering, last acting [4,14,26]
    pg 6.c8 is stuck peering for 10h, current state peering, last acting [0,22,32]
    pg 6.c9 is stuck peering for 11h, current state peering, last acting [32,20,0]
    pg 6.ca is stuck peering for 11h, current state peering, last acting [31,0,23]
    pg 6.cb is stuck peering for 10h, current state peering, last acting [8,35,16]
    pg 6.cc is stuck peering for 10h, current state peering, last acting [8,24,13]
    pg 6.cd is stuck peering for 10h, current state peering, last acting [15,25,1]
    pg 6.ce is stuck peering for 11h, current state peering, last acting [27,23,4]
    pg 6.cf is stuck peering for 11h, current state peering, last acting [25,4,20]
    pg 7.c4 is stuck peering for 11m, current state remapped+peering, last acting [19,8]
    pg 7.c5 is stuck peering for 10h, current state peering, last acting [6,14,32]
    pg 7.c6 is stuck peering for 10h, current state peering, last acting [14,35,5]
    pg 7.c7 is stuck peering for 10h, current state remapped+peering, last acting [11,14]
    pg 7.c8 is stuck peering for 10h, current state peering, last acting [21,9,28]
    pg 7.c9 is stuck peering for 10h, current state peering, last acting [0,30,15]
    pg 7.ca is stuck peering for 10h, current state peering, last acting [23,2,26]
    pg 7.cb is stuck peering for 10h, current state peering, last acting [23,9,24]
    pg 7.cc is stuck peering for 10h, current state peering, last acting [23,27,0]
    pg 7.cd is stuck peering for 11m, current state remapped+peering, last acting [13,6]
    pg 7.ce is stuck peering for 10h, current state peering, last acting [16,1,25]
    pg 7.cf is stuck peering for 11h, current state peering, last acting [24,16,8]
    pg 9.c0 is stuck peering for 10h, current state peering, last acting [21,28]
    pg 9.c1 is stuck peering for 10h, current state peering, last acting [12,31]
    pg 9.c2 is stuck peering for 10h, current state peering, last acting [6,27]
    pg 9.c3 is stuck peering for 10h, current state peering, last acting [9,27]
    pg 9.c4 is stuck peering for 50m, current state peering, last acting [17,34]
    pg 9.c5 is stuck peering for 11h, current state peering, last acting [31,8]
    pg 9.c6 is stuck peering for 10h, current state peering, last acting [1,29]
    pg 9.c7 is stuck peering for 10h, current state peering, last acting [12,30]
    pg 9.c8 is stuck peering for 11h, current state peering, last acting [26,3]
    pg 9.c9 is stuck peering for 11h, current state peering, last acting [29,13]
    pg 9.ca is stuck peering for 11h, current state peering, last acting [25,6]
    pg 9.cb is stuck peering for 10h, current state peering, last acting [16,9]
    pg 9.cc is stuck peering for 4h, current state peering, last acting [4,29]
    pg 10.c0 is stuck peering for 11h, current state peering, last acting [32,19]
    pg 10.c1 is stuck peering for 10h, current state peering, last acting [23,6]
    pg 10.c2 is stuck peering for 11h, current state peering, last acting [24,7]
    pg 10.c3 is stuck peering for 38m, current state peering, last acting [5,20]
    pg 10.c4 is stuck peering for 10h, current state peering, last acting [21,4]
    pg 10.c5 is stuck peering for 10h, current state peering, last acting [12,8]
    pg 10.c6 is stuck peering for 11h, current state peering, last acting [34,7]
    pg 10.c7 is stuck peering for 10h, current state peering, last acting [17,30]
    pg 10.c8 is stuck peering for 11h, current state peering, last acting [24,19]
    pg 10.c9 is stuck inactive for 54m, current state activating, last acting [13,3]
    pg 10.ca is stuck peering for 10h, current state peering, last acting [16,6]
    pg 10.cb is stuck peering for 11h, current state peering, last acting [26,13]
    pg 10.cf is stuck peering for 50m, current state peering, last acting [21,24]
[WRN] SLOW_OPS: 26 slow ops, oldest one blocked for 673 sec, daemons [osd.10,osd.13,osd.14,osd.15,osd.16,osd.18,osd.20,osd.21,osd.24,osd.25]... have slow ops.
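If it would help to dig further, I can query one of the stuck PGs directly and dump the in-flight ops on one of the OSDs reporting slow ops, for example with pg 6.c3 and osd.10 from the output above (the pg query should show the peering progress and any "blocked_by" OSDs; the daemon command has to be run on the node hosting osd.10, which is dcs1 here):

[ceph: root@dcs1 /]# ceph pg 6.c3 query
[ceph: root@dcs1 /]# ceph daemon osd.10 dump_ops_in_flight

I am happy to post that output as well if it is useful.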