> The reason it is so long is that you don't want to move data
> around unnecessarily if the osd is just being rebooted/restarted.

I think you're confusing down with out. When an OSD is out, Ceph backfills. While it is merely down, Ceph hopes that it will come back. But it will direct I/O to other redundant OSDs instead of a down one.

Going down leads to going out, and I believe that is the 600 seconds you mention - the time between when the OSD is marked down and when Ceph marks it out (if all other conditions permit).

There is a pretty good explanation of how OSDs get marked down, which is pretty complicated, at

http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/

It just doesn't seem to match the implementation.

-- 
Bryan Henderson                                   San Jose, California
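
For illustration only, the down/out distinction described above can be written as a toy timeline. This is a sketch, not Ceph code: the function name, the returned fields, and the `conditions_permit` flag (standing in for "if all other conditions permit") are hypothetical, and it assumes the 600 seconds in question is the `mon_osd_down_out_interval` setting.

```python
# Toy model of the down -> out transition. Not Ceph code.
# Assumption: the 600 seconds discussed is mon_osd_down_out_interval.
DOWN_OUT_INTERVAL = 600  # seconds


def osd_status(seconds_since_marked_down,
               down_out_interval=DOWN_OUT_INTERVAL,
               conditions_permit=True):
    """Describe an OSD that was marked down some seconds ago.

    conditions_permit is a hypothetical stand-in for whatever else
    must hold before the OSD is marked out.
    """
    if seconds_since_marked_down < 0:
        raise ValueError("seconds_since_marked_down must be >= 0")

    # A down OSD gets no I/O; it goes to other redundant OSDs.
    serves_io = False

    # The OSD is marked out only after the interval has elapsed.
    marked_out = (conditions_permit
                  and seconds_since_marked_down >= down_out_interval)

    # Backfill (data movement) begins once the OSD is out, not before.
    return {
        "serves_io": serves_io,
        "marked_out": marked_out,
        "backfill": marked_out,
    }


if __name__ == "__main__":
    for t in (30, 599, 600, 3600):
        print(t, osd_status(t))
```

A quick restart that finishes within the interval never reaches the "out" state in this model, so no data moves, while I/O is redirected for the whole time the OSD is down.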