On Mon, Mar 26, 2012 at 12:36, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> Currently when you shutdown/kill a ceph-osd it is no different from it
> crashing: you have to wait N seconds for its peers to conclude the process
> is down before the OSD is deemed 'failed' and the osd map is updated.
>
> This would be pretty easy to improve on:
>
> - on a clean shutdown (e.g., due to SIGTERM), we could execv a call to
>   the ceph tool to tell the monitors the osd stopped (maybe with a
>   'reason' and nice log message).
>
> - on an unclean shutdown (e.g., failed assert, segfault) we can
>   do the same, with an appropriate message in the system log

A clean shutdown can send the "This location is now defunct" message
itself; execve just seems like an extra complication there.

For daemon crashes, perhaps the next run, after upstart/etc restarts
the daemon, can somehow proactively convince the others that the new
osd.42 is better than the old osd.42.

That sounds like a good feature to have.
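
To make the two ideas above concrete, here is a rough sketch in Python. It is a toy model, not ceph-osd or monitor code: `Monitor`, `Osd`, `mark_down`, `boot`, and the per-start `incarnation` counter are all names invented for illustration, and the "monitor" is just an in-memory object. It shows the daemon reporting its own clean shutdown in-process (no exec of an external tool), and a restarted daemon with a higher incarnation superseding the old instance without waiting for a peer timeout.

```python
import signal


class Monitor:
    """Toy stand-in for the monitor cluster (hypothetical, not Ceph's API)."""

    def __init__(self):
        self.up = {}   # osd_id -> incarnation currently considered up
        self.log = []  # human-readable reasons for state changes

    def mark_down(self, osd_id, incarnation, reason):
        # Ignore stale reports about an instance that was already replaced.
        if self.up.get(osd_id) == incarnation:
            del self.up[osd_id]
            self.log.append(f"osd.{osd_id} down: {reason}")

    def boot(self, osd_id, incarnation):
        old = self.up.get(osd_id)
        if old is not None and incarnation <= old:
            return False  # not newer than the instance we already know about
        if old is not None:
            # A newer instance implies the old one is gone; no timeout needed.
            self.log.append(
                f"osd.{osd_id} down: superseded by incarnation {incarnation}"
            )
        self.up[osd_id] = incarnation
        return True


class Osd:
    """Toy daemon; `incarnation` is assumed to increase on every start."""

    def __init__(self, monitor, osd_id, incarnation):
        self.monitor = monitor
        self.osd_id = osd_id
        self.incarnation = incarnation
        self.stopping = False

    def start(self):
        return self.monitor.boot(self.osd_id, self.incarnation)

    def install_sigterm_handler(self):
        # The handler only sets a flag; the main loop would then call shutdown().
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        self.stopping = True

    def shutdown(self, reason="clean shutdown (SIGTERM)"):
        # Sent by the daemon itself, so no exec of an external tool is needed.
        self.monitor.mark_down(self.osd_id, self.incarnation, reason)
```

In this sketch the main loop would poll `stopping` and call `shutdown()` before exiting. The incarnation check in `mark_down` is there so that a late "I'm going down" from an old instance cannot knock out its replacement.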