Hi all,

We have run into a problem. The original cluster had OSD0 through OSD23, and we then added OSD24 through OSD27.

The OSD20 log:

2011-09-28 10:32:50.602820 7f63498b6700 osd20 27 update_heartbeat_peers: new _from osd24 192.168.0.118:6802/10487
2011-09-28 10:32:50.602831 7f63498b6700 -- 192.168.0.116:6802/10666 --> 192.168.0.118:6802/10487 -- osd_ping(e0 as_of 27 request_heartbeat) v1 -- ?+0 0x45a08c0 con 0x44863c0

The OSD24 log:

2011-09-28 10:13:23.325257 7f1c33c99700 osd24 25 advance to epoch 26 (<= newest 27)
2011-09-28 10:13:23.325261 7f1c33c99700 osd24 25 get_map 26 - cached 0x156f000
2011-09-28 10:13:23.325268 7f1c33c99700 osd24 26 advance_map epoch 26 0 pgs
2011-09-28 10:13:23.325273 7f1c33c99700 osd24 26 get_map 25 - cached 0x14d1c00
2011-09-28 10:13:23.325279 7f1c33c99700 osd24 26 advance to epoch 27 (<= newest 27)
2011-09-28 10:13:23.325282 7f1c33c99700 osd24 26 get_map 27 - cached 0x156f300
2011-09-28 10:13:23.325288 7f1c33c99700 osd24 27 advance_map epoch 27 0 pgs
2011-09-28 10:13:23.325292 7f1c33c99700 osd24 27 get_map 26 - cached 0x156f000
2011-09-28 10:13:23.325298 7f1c33c99700 osd24 27 activate_map version 27
2011-09-28 10:16:43.576857 7f1c33c99700 osd24 28 advance to epoch 29 (<= newest 29)
2011-09-28 10:16:43.576868 7f1c33c99700 osd24 28 get_map 29 - cached 0x156f300
2011-09-28 10:16:43.576894 7f1c33c99700 osd24 29 advance_map epoch 29 27 pgs

I cannot figure out why OSD24 has 0 pgs when it gets the osdmap for epoch 27, even though OSD20 already regards OSD24 as a new heartbeat_from peer at epoch 27. This results in OSD20 wrongly marking OSD24 down.

Is it normal operation to mark down an OSD whose heartbeat has timed out?

Thanks!
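
For anyone comparing logs like these, here is a minimal Python sketch that pulls the PG count per epoch out of the advance_map lines quoted above, so it is easier to see at which epoch the OSD first holds any PGs. The regular expression is based only on the log lines in this message and may not match other Ceph versions or debug levels. The names pgs_per_epoch and first_epoch_with_pgs are hypothetical helpers of my own, not part of Ceph.

```python
import re

# Matches lines such as (format assumed from the excerpt in this message):
#   2011-09-28 10:13:23.325288 7f1c33c99700 osd24 27 advance_map epoch 27 0 pgs
ADVANCE_MAP = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \S+ "
    r"osd(?P<osd>\d+) \d+ advance_map epoch (?P<epoch>\d+) (?P<pgs>\d+) pgs"
)


def pgs_per_epoch(lines):
    """Return {epoch: pg_count} for every advance_map line found."""
    counts = {}
    for line in lines:
        match = ADVANCE_MAP.match(line.strip())
        if match:
            counts[int(match.group("epoch"))] = int(match.group("pgs"))
    return counts


def first_epoch_with_pgs(counts):
    """Return the lowest epoch with a non-zero PG count, or None."""
    for epoch in sorted(counts):
        if counts[epoch] > 0:
            return epoch
    return None


# Lines copied from the OSD24 log above; the activate_map line is ignored.
SAMPLE = [
    "2011-09-28 10:13:23.325268 7f1c33c99700 osd24 26 advance_map epoch 26 0 pgs",
    "2011-09-28 10:13:23.325288 7f1c33c99700 osd24 27 advance_map epoch 27 0 pgs",
    "2011-09-28 10:13:23.325298 7f1c33c99700 osd24 27 activate_map version 27",
    "2011-09-28 10:16:43.576894 7f1c33c99700 osd24 29 advance_map epoch 29 27 pgs",
]

if __name__ == "__main__":
    counts = pgs_per_epoch(SAMPLE)
    print(counts)
    print(first_epoch_with_pgs(counts))
```

On the excerpt above this reports 0 pgs at epochs 26 and 27 and 27 pgs at epoch 29, which is the gap the question is about.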