On Wed, 1 Apr 2015, John Spray wrote:
> I guess in this interesting case you could either:
>  * Allow other OSDs on the same host to handle the 'tell blink' command for
>    the dead OSD's drive
>  * Leave this to calamari/whoever to read the dead OSD's block device path
>    from "ceph osd metadata", and go blink the LEDs themselves.

#2 really sounds safer to me.  In particular, you need to be really
careful not to flash an LED until you're sure you don't need the data on
the disk (i.e., it's down+out and the cluster state is healthy--no heroic
measures needed).  I think anything that triggers flashing without a
holistic view of the cluster would be dangerous.  That, combined with the
complications around ceph-osd possibly not running, makes me think it
should be the calamari agent that does the flashing.

It may also be necessary for the disk -> last known state mapping to go
somewhere other than just the osd metadata; if the osd is recreated or
the id gets reused, that info goes away.  (We could also be careful to
avoid deallocating the id until the disk is removed, I guess, but it's
another constraint to worry about.)

sage
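
To make the safety condition above concrete, here is a rough sketch of the check an agent might run before flashing anything. It is only an illustration: safe_to_blink is a hypothetical helper, and the input shapes are assumptions (an "osds" list with "osd", "up" and "in" fields, as in JSON-formatted osd dump output, and a health dict with a "status" field, which may be named differently depending on the Ceph version).

```python
def safe_to_blink(osd_id, osd_dump, health):
    """Return True only if the OSD is down+out and the cluster is healthy.

    Hypothetical helper. Assumed inputs:
      osd_dump: {"osds": [{"osd": <id>, "up": 0 or 1, "in": 0 or 1}, ...]}
      health:   {"status": "HEALTH_OK" | "HEALTH_WARN" | "HEALTH_ERR"}
    """
    # Find the entry for this OSD id, if the cluster still knows about it.
    entry = next(
        (o for o in osd_dump.get("osds", []) if o.get("osd") == osd_id),
        None,
    )
    if entry is None:
        # The id may have been deallocated or reused; refuse rather than guess.
        return False

    # The disk must no longer be holding data the cluster might need.
    down_and_out = entry.get("up") == 0 and entry.get("in") == 0

    # Require a fully healthy cluster, i.e. no heroic recovery pending.
    healthy = health.get("status") == "HEALTH_OK"

    return down_and_out and healthy
```

The sketch fails closed: an unknown id, an OSD that is still in, or anything other than a healthy cluster all mean "don't flash".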