Re: Aggregate failure report in ceph -s

Sage Weil <sweil@xxxxxxxxxx> · Fri, 20 Nov 2015 03:27:56 -0800 (PST)

On Fri, 20 Nov 2015, Chen, Xiaoxi wrote:
> 
> Hi Sage,
> 
>        As we are looking at the failure detection part of Ceph (basically
> around the OSD flipping issue), we got a suggestion from a customer to
> show an aggregated failure report in "ceph -s". The idea is:
> 
>       When an OSD finds it cannot hear heartbeats from some of its peers,
> it will try to aggregate by failure domain, say "I cannot reach all my
> peers in Rack C, something wrong?", and this kind of log will be shown in
> ceph -s. So if we look at ceph -s and notice a lot of complaints about not
> reaching Rack C, we can easily diagnose that Rack C has a network issue.
> 
>  
> 
>       Does that make sense?

Yeah, sounds reasonable to me!  It's a bit more awkward to do this at the 
mon level since rack C may talk to the mon, but doing it at the OSD makes 
sense.  There will be a lot of heuristics involved, though.  I expect the 
messages might include

- cannot reach _% of peers outside of my $crushlevel $foo [on front|back]
- cannot reach _% of hosts in $crushlevel $foo [on front|back]

?
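
To make the two message shapes above concrete, here is a rough Python
sketch of the aggregation an OSD might do. This is only an illustration
of the heuristic, not Ceph code: the function name, the peer/location
dictionaries, and the 50% threshold are all assumptions made up for the
example, and it assumes the chosen crush level sits above "host".

```python
from collections import defaultdict


def summarize_failures(peers, unreachable, my_location,
                       level="rack", network="back", threshold=50):
    """Return aggregated heartbeat-failure messages for one OSD.

    All inputs are hypothetical structures for this sketch:
      peers:       {osd_id: {"host": ..., "rack": ...}} for heartbeat peers
      unreachable: set of osd_ids whose heartbeats timed out on `network`
      my_location: this OSD's own {"host": ..., "rack": ...}
      threshold:   minimum percentage before a message is emitted
    """
    messages = []
    mine = my_location[level]

    # Shape 1: share of peers outside my own bucket that I cannot reach.
    outside = [osd for osd, loc in peers.items() if loc[level] != mine]
    if outside:
        down = sum(1 for osd in outside if osd in unreachable)
        pct = 100 * down // len(outside)
        if pct >= threshold:
            messages.append(
                f"cannot reach {pct}% of peers outside of my "
                f"{level} {mine} on {network}")

    # Shape 2: per bucket, share of hosts on which no peer is reachable.
    # by_bucket maps bucket -> host -> list of "peer is reachable" flags.
    by_bucket = defaultdict(lambda: defaultdict(list))
    for osd, loc in peers.items():
        by_bucket[loc[level]][loc["host"]].append(osd not in unreachable)
    for bucket in sorted(by_bucket):
        hosts = by_bucket[bucket]
        dead = sum(1 for flags in hosts.values() if not any(flags))
        pct = 100 * dead // len(hosts)
        if pct >= threshold:
            messages.append(
                f"cannot reach {pct}% of hosts in {level} {bucket} "
                f"on {network}")

    return messages


if __name__ == "__main__":
    peers = {
        1: {"host": "a1", "rack": "A"},
        2: {"host": "b1", "rack": "B"},
        3: {"host": "c1", "rack": "C"},
        4: {"host": "c1", "rack": "C"},
        5: {"host": "c2", "rack": "C"},
    }
    me = {"host": "a0", "rack": "A"}
    for line in summarize_failures(peers, {3, 4, 5}, me):
        print(line)
```

With an OSD in rack A that loses all three of its peers in rack C, this
yields one message of each shape. The thresholds and the choice of crush
level are exactly where the heuristics would need tuning.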

Also note that it would be easiest to log these in the cluster log (ceph 
-w, not ceph -s). I'm guessing that's what you mean?

Thanks!
sage