Re: udev remove events to mark OSD down/out on disk-pull

On Wed, 16 Nov 2016 14:50:30 +0000 (UTC), Sage Weil wrote:

> On Wed, 16 Nov 2016, David Disseldorp wrote:
> > Hi,
> > 
> > I'm currently looking at ways to speed up OSD down/out notifications
> > for disk-pull events, and was investigating using udev remove events
> > for this.
> > 
> > IIUC, the outage currently propagates through to the mons via OSD device
> > I/O error -> filestore I/O error -> ceph-osd ceph_abort() -> heartbeat
> > failure.
> 
> We just merged (post-jewel) a change that makes connection refused events 
> trigger an immediate mark-down of the peer OSD.  I think this will have 
> the same effect, as long as the ceph-osd process is killed in a timely 
> manner.  Have you tried it?  I'd suggest making sure that it's not 
> sufficient before investing too much time into a udev-based approach...
> 
> See a033dc6f5b4cef357db6f5951062d680e880ba0e

Looks much cleaner than handling this in udev. I'll test this with
Jewel and follow up - thanks Sage!

Cheers, David