udev remove events to mark OSD down/out on disk-pull

Hi,

I'm currently looking at ways to speed up OSD down/out notifications
for disk-pull events, and was investigating using udev remove events
for this.

IIUC, a disk-pull outage currently propagates through to the mons via
OSD device I/O error -> filestore I/O error -> ceph-osd ceph_abort() ->
heartbeat failure.

For the disk-pull case, this should be relatively easy to speed up
by handling the remove event in 95-ceph-osd.rules and sending an
appropriate osd down/out notification to the mons. The problem then
becomes keeping the information needed for that consistent in the udev
database (all stashed via IMPORT{program}):
- cluster / OSD ids
- appropriate cephx creds
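
To make the idea concrete, here is a rough sketch of a helper that a
RUN+= entry on ACTION=="remove" could invoke. It is only an illustration,
not an existing Ceph script. The property names CEPH_CLUSTER, CEPH_OSD_ID,
CEPH_CLIENT_NAME and CEPH_KEYRING are hypothetical; they stand in for
whatever IMPORT{program} would have stashed in the udev database when the
device was added. The only real interfaces assumed are the udev ACTION
variable and the "ceph osd down" / "ceph osd out" CLI commands.

```python
import os
import subprocess


def build_osd_commands(env):
    """Return the ceph CLI invocations for a udev remove event, or [] to skip."""
    # udev exports ACTION to programs it runs; only act on device removal.
    if env.get("ACTION") != "remove":
        return []

    # Hypothetical property, assumed to be stashed via IMPORT{program} on add.
    osd_id = env.get("CEPH_OSD_ID")
    if osd_id is None or not osd_id.isdigit():
        return []

    base = ["ceph", "--cluster", env.get("CEPH_CLUSTER", "ceph")]

    # Hypothetical properties carrying the cephx identity to use.
    name = env.get("CEPH_CLIENT_NAME")
    if name:
        base += ["--name", name]
    keyring = env.get("CEPH_KEYRING")
    if keyring:
        base += ["--keyring", keyring]

    # Mark the OSD down first, then out.
    return [base + ["osd", "down", osd_id], base + ["osd", "out", osd_id]]


def main():
    for cmd in build_osd_commands(os.environ):
        # Bounded timeout so a hung mon connection does not stall the helper.
        subprocess.run(cmd, check=False, timeout=10)


if __name__ == "__main__":
    main()
```

The command construction is kept separate from execution so it can be
checked without a cluster. Whether shelling out to the ceph CLI from a
udev helper is acceptable at all is part of the open question here.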

Before I hack something up for this, I'm interested in what others
think, and whether anyone has already gone down this path. I seem to
recall someone attempting to change the ceph-osd behaviour on I/O
error at some stage.

Cheers, David