Hi,

I'm currently looking at ways to speed up OSD down/out notifications for disk-pull events, and was investigating using udev remove events for this.

IIUC, the outage currently propagates through to the mons via:
OSD device I/O error -> filestore I/O error -> ceph-osd ceph_abort() -> heartbeat failure.

For the disk-pull case, this should be relatively easy to speed up by handling the remove event in 95-ceph-osd.rules with an appropriate osd down/out PDU.

The problem then becomes maintaining consistent information in the udev database (all stashed via IMPORT{program}):
- cluster / OSD ids
- appropriate cephx creds

Before I hack something up for this, I'm interested in what others think, and whether anyone has already gone down this path. I seem to recall someone attempting to change the ceph-osd behaviour on I/O error at some stage.

Cheers, David
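
To make the remove-event idea above concrete, here is a rough sketch of a helper that a remove rule in 95-ceph-osd.rules could run. It is only an illustration under assumptions: the property names CEPH_CLUSTER, CEPH_OSD_ID and CEPH_KEYRING are hypothetical stand-ins for whatever IMPORT{program} would stash in the udev database, and the helper simply shells out to the existing `ceph osd down` / `ceph osd out` commands rather than sending anything to the mons directly.

```python
import os
import subprocess


def build_commands(env):
    """Return the ceph CLI invocations for a removed OSD device.

    Returns an empty list when the event is not a remove event or does not
    carry the (hypothetical) stashed OSD properties.
    """
    if env.get("ACTION") != "remove":  # ACTION is set by udev for RUN programs
        return []
    osd_id = env.get("CEPH_OSD_ID")  # hypothetical property name
    if osd_id is None or not osd_id.isdigit():
        return []
    # Hypothetical property; fall back to the default cluster name.
    base = ["ceph", "--cluster", env.get("CEPH_CLUSTER", "ceph")]
    keyring = env.get("CEPH_KEYRING")  # hypothetical property name
    if keyring:
        base += ["--keyring", keyring]
    # Mark the OSD down first, then out, so data can start rebalancing.
    return [base + ["osd", "down", osd_id], base + ["osd", "out", osd_id]]


def main():
    for argv in build_commands(os.environ):
        subprocess.call(argv)


if __name__ == "__main__":
    main()
```

Keeping the command construction separate from the execution makes it easy to check what would be sent for a given set of udev properties without touching a live cluster.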