Re: osd shutdown notification

Tommi Virtanen <tommi.virtanen@xxxxxxxxxxxxx> · Mon, 26 Mar 2012 12:49:16 -0700

On Mon, Mar 26, 2012 at 12:36, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> Currently when you shutdown/kill a ceph-osd it is no different from it
> crashing: you have to wait N seconds for its peers to conclude the process
> is down before the OSD is deemed 'failed' and the osd map is updated.
>
> This would be pretty easy to improve on:
>
>  - on a clean shutdown (e.g., due to SIGTERM), we could execv a call to
>   the ceph tool to tell the monitors the osd stopped (maybe with a
>   'reason' and nice log message).
>
>  - on an unclean shutdown (e.g., failed assert, segfault) we can
>   do the same, with an appropriate message in the system log

A clean shutdown can send the "This location is now defunct" message
itself; an execve seems like an unnecessary complication for that.
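
A minimal sketch of that in-process approach, in Python rather than
the daemon's own C++ and not taken from the Ceph code: the SIGTERM
handler only sets a flag, and the main loop sends the notification
before exiting. The Daemon class, the notify callable, and the
message fields are all hypothetical names chosen for illustration.

```python
import os
import signal
import threading


class Daemon:
    """Hypothetical stand-in for an OSD process; not Ceph code."""

    def __init__(self, name, notify):
        self.name = name
        self.notify = notify            # callable that delivers a message to the monitors
        self.stop = threading.Event()

    def handle_sigterm(self, signum, frame):
        # Only set a flag here; the real work happens outside the handler.
        self.stop.set()

    def run(self, poll_interval=0.1):
        while not self.stop.wait(poll_interval):
            pass                        # normal request handling would go here
        # Clean shutdown: announce it ourselves, no exec of an external tool.
        self.notify({"who": self.name, "state": "down", "reason": "clean shutdown"})


if __name__ == "__main__":
    sent = []
    daemon = Daemon("osd.42", sent.append)
    signal.signal(signal.SIGTERM, daemon.handle_sigterm)
    os.kill(os.getpid(), signal.SIGTERM)    # simulate `kill -TERM` for the demo
    daemon.run()
    print(sent)
```

Keeping the handler down to setting a flag means the message is sent
from ordinary code with the daemon's existing connection to the
monitors, which is what makes the exec unnecessary.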

For daemon crashes, perhaps the next run, after upstart/etc restarts
the daemon, can somehow proactively convince others that the new
osd.42 is better than the old osd.42. That sounds like a good feature
to have.
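
One way to picture this, purely as a sketch: each start of the daemon
carries a larger incarnation number, and whoever keeps the map treats
a newer incarnation as proof that the older one is gone. The
incarnation counter, the Monitor class, and its boot method are
assumptions made up for this example, not something from the existing
code or from this thread.

```python
class Monitor:
    """Hypothetical map keeper; names and fields are illustrative only."""

    def __init__(self):
        self.up = {}    # daemon name -> (incarnation, address)

    def boot(self, name, incarnation, address):
        """Record a boot announcement; return True if it replaced the entry."""
        current = self.up.get(name)
        if current is not None and incarnation <= current[0]:
            return False                # stale or duplicate announcement, ignore it
        # A newer instance announced itself, so the old one must be gone.
        self.up[name] = (incarnation, address)
        return True


if __name__ == "__main__":
    mon = Monitor()
    mon.boot("osd.42", 1, "10.0.0.1:6800")
    mon.boot("osd.42", 2, "10.0.0.1:6801")  # restarted after a crash
    print(mon.up["osd.42"])
```

With something like this, peers would not need to wait out the
failure timeout for the crashed instance; the restart itself is the
signal.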