Hi guys,

We have recently been experiencing some OSD crashes, such as messenger crashes and a few strange crashes that are still being investigated. These crashes do not seem to reproduce after the OSD is restarted.

So we are thinking about a strategy of automatically restarting a crashed OSD once or twice, and then leaving it down if restarting doesn't help. This strategy might reduce, to some extent, the impact of PG peering and recovery on online traffic, since we won't mark an OSD out automatically even if it is down, unless we are sure it is a disk failure.

However, we are also aware that this strategy may cause us some problems. Since you guys have more experience with Ceph, we would like to hear your suggestions.

Thanks.
David Zhang
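
P.S. To make the idea concrete, here is a rough sketch of the restart policy described in this mail. It is only an illustration of the decision logic, not something we have running: the class name `RestartPolicy`, the `on_crash` method, and the time window are all hypothetical, and the time window in particular is an extra assumption (the idea being that old restarts should eventually stop counting against an OSD). The sketch does not call Ceph or any init system; whatever supervises the OSD process would have to act on the returned decision.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RestartPolicy:
    """Hypothetical per-OSD policy: restart a crashed OSD a limited
    number of times, then leave it down for an operator to inspect."""

    max_restarts: int = 2            # "1 or 2 times" from the mail
    window_seconds: float = 3600.0   # assumption: restarts older than this are forgotten
    restart_times: List[float] = field(default_factory=list)

    def on_crash(self, now: float) -> str:
        """Return "restart" or "leave_down" for a crash observed at time `now`."""
        # Only restarts inside the window count against the budget.
        self.restart_times = [
            t for t in self.restart_times if now - t < self.window_seconds
        ]
        if len(self.restart_times) < self.max_restarts:
            self.restart_times.append(now)
            return "restart"
        # Budget exhausted: stop restarting and leave the OSD down.
        return "leave_down"


if __name__ == "__main__":
    policy = RestartPolicy(max_restarts=2, window_seconds=600.0)
    for crash_time in (0.0, 10.0, 20.0, 700.0):
        print(crash_time, policy.on_crash(crash_time))
```

One open question with this kind of policy is how long the window should be: too short and a flapping OSD keeps being restarted, too long and a single unrelated crash days later is treated as a repeat failure.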