We recently had a few Ceph nodes go offline, which required a reboot. I have been able to get the cluster back to the state listed below; however, it does not seem to progress past 23473/287823588 objects misplaced.

Yesterday about 13% of the data was misplaced. This morning it has gotten down to 0.008%, but it has not moved past this point in about an hour.

Does anyone see anything in the output below that points to the problem, or are there any suggestions I can follow to figure out why the cluster health is not moving beyond this point?

---------------------------------------------------

root@rbd1:~# ceph -s
  cluster:
    id:     504b5794-34bd-44e7-a8c3-0494cf800c23
    health: HEALTH_ERR
            crush map has legacy tunables (require argonaut, min is firefly)
            23473/287823588 objects misplaced (0.008%)
            14 scrub errors
            Reduced data availability: 2 pgs inactive
            Possible data damage: 8 pgs inconsistent

  services:
    mon: 3 daemons, quorum hqceph1,hqceph2,hqceph3
    mgr: hqceph2(active), standbys: hqceph3
    osd: 288 osds: 270 up, 270 in; 2 remapped pgs
    rgw: 1 daemon active

  data:
    pools:   17 pools, 9411 pgs
    objects: 95.95M objects, 309TiB
    usage:   936TiB used, 627TiB / 1.53PiB avail
    pgs:     0.021% pgs not active
             23473/287823588 objects misplaced (0.008%)
             9369 active+clean
             30   active+clean+scrubbing+deep
             8    active+clean+inconsistent
             2    activating+remapped
             2    active+clean+scrubbing

  io:
    client:   1000B/s rd, 0B/s wr, 0op/s rd, 0op/s wr

root@rbd1:~# ceph health detail
HEALTH_ERR crush map has legacy tunables (require argonaut, min is firefly); 1 osds down; 23473/287823588 objects misplaced (0.008%); 14 scrub errors; Reduced data availability: 3 pgs inactive, 13 pgs peering; Possible data damage: 8 pgs inconsistent; Degraded data redundancy: 408658/287823588 objects degraded (0.142%), 38 pgs degraded
OLD_CRUSH_TUNABLES crush map has legacy tunables (require argonaut, min is firefly)
    see http://docs.ceph.com/docs/master/rados/operations/crush-map/#tunables
OSD_DOWN 1 osds down
    osd.95 (root=default,host=hqosd8) is down
OBJECT_MISPLACED 23473/287823588 objects misplaced (0.008%)
OSD_SCRUB_ERRORS 14 scrub errors
PG_AVAILABILITY Reduced data availability: 3 pgs inactive, 13 pgs peering
    pg 3.b41 is stuck peering for 106.682058, current state peering, last acting [204,190]
    pg 3.c33 is stuck peering for 103.403643, current state peering, last acting [228,274]
    pg 3.d15 is stuck peering for 128.537454, current state peering, last acting [286,24]
    pg 3.fa9 is stuck peering for 106.526146, current state peering, last acting [286,47]
    pg 3.fb7 is stuck peering for 105.878878, current state peering, last acting [62,97]
    pg 3.13a2 is stuck peering for 106.491138, current state peering, last acting [270,219]
    pg 3.1521 is stuck inactive for 170180.165265, current state activating+remapped, last acting [94,186,188]
    pg 3.1565 is stuck peering for 106.782784, current state peering, last acting [121,60]
    pg 3.157c is stuck peering for 128.557448, current state peering, last acting [128,268]
    pg 3.1744 is stuck peering for 106.639603, current state peering, last acting [192,142]
    pg 3.1ac8 is stuck peering for 127.839550, current state peering, last acting [221,190]
    pg 3.1e24 is stuck peering for 128.201670, current state peering, last acting [118,158]
    pg 3.1e46 is stuck inactive for 169121.764376, current state activating+remapped, last acting [87,199,170]
    pg 18.36 is stuck peering for 128.554121, current state peering, last acting [204]
    pg 21.1ce is stuck peering for 106.582584, current state peering, last acting [266,192]
PG_DAMAGED Possible data damage: 8 pgs inconsistent
    pg 3.1ca is active+clean+inconsistent, acting [201,8,180]
    pg 3.56a is active+clean+inconsistent, acting [148,240,8]
    pg 3.b0f is active+clean+inconsistent, acting [148,260,8]
    pg 3.b56 is active+clean+inconsistent, acting [218,8,240]
    pg 3.10ff is active+clean+inconsistent, acting [262,8,211]
    pg 3.1192 is active+clean+inconsistent, acting [192,8,187]
    pg 3.124a is active+clean+inconsistent, acting [123,8,222]
    pg 3.1c55 is active+clean+inconsistent, acting [180,8,287]
PG_DEGRADED Degraded data redundancy: 408658/287823588 objects degraded (0.142%), 38 pgs degraded
    pg 3.8f is active+undersized+degraded, acting [163,149]
    pg 3.ba is active+undersized+degraded, acting [68,280]
    pg 3.1aa is active+undersized+degraded, acting [176,211]
    pg 3.29e is active+undersized+degraded, acting [241,194]
    pg 3.323 is active+undersized+degraded, acting [78,194]
    pg 3.343 is active+undersized+degraded, acting [242,144]
    pg 3.4ae is active+undersized+degraded, acting [153,237]
    pg 3.524 is active+undersized+degraded, acting [252,222]
    pg 3.5c9 is active+undersized+degraded, acting [272,252]
    pg 3.713 is active+undersized+degraded, acting [273,80]
    pg 3.730 is active+undersized+degraded, acting [235,212]
    pg 3.88f is active+undersized+degraded, acting [222,285]
    pg 3.8cb is active+undersized+degraded, acting [285,20]
    pg 3.9a0 is active+undersized+degraded, acting [240,200]
    pg 3.c19 is active+undersized+degraded, acting [165,276]
    pg 3.ec8 is active+undersized+degraded, acting [158,40]
    pg 3.1025 is active+undersized+degraded, acting [258,274]
    pg 3.1058 is active+undersized+degraded, acting [38,68]
    pg 3.14e4 is active+undersized+degraded, acting [185,39]
    pg 3.150c is active+undersized+degraded, acting [138,140]
    pg 3.1545 is active+undersized+degraded, acting [222,55]
    pg 3.15a6 is active+undersized+degraded, acting [242,272]
    pg 3.1620 is active+undersized+degraded, acting [200,164]
    pg 3.1710 is active+undersized+degraded, acting [176,285]
    pg 3.1792 is active+undersized+degraded, acting [190,11]
    pg 3.17bd is active+undersized+degraded, acting [207,15]
    pg 3.17da is active+undersized+degraded, acting [5,160]
    pg 3.183e is active+undersized+degraded, acting [273,136]
    pg 3.197d is active+undersized+degraded, acting [241,139]
    pg 3.1a3d is active+undersized+degraded, acting [184,121]
    pg 3.1ba6 is active+undersized+degraded, acting [47,249]
    pg 3.1c2b is active+undersized+degraded, acting [268,80]
    pg 3.1ca2 is active+undersized+degraded, acting [280,152]
    pg 3.1cd4 is active+undersized+degraded, acting [2,129]
    pg 3.1e13 is active+undersized+degraded, acting [247,114]
    pg 12.56 is active+undersized+degraded, acting [54]
    pg 18.8 is undersized+degraded+peered, acting [260]
    pg 21.9f is active+undersized+degraded, acting [215,201]

--------------------------------------------------------------------------------------------------

Thanks,
Shain

Shain Miley | Director of Platform and Infrastructure | Digital Media | smiley@xxxxxxx