I think we could find the five OSDs that have the highest average slow
times, as well as the five OSDs with the highest absolute maximum time.
Should we also correlate this with the SMART data for the disk?

Some agent has to do the comparison on each storage node and make the
result available to other nodes to compare with their own data.

-Sreenath

On 11/22/14, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
> The challenge I think is that "slow osd" is probably a global
> question. That is, I think it requires the agent to compare a given
> osd to the other osds in the cluster (and to itself earlier in time).
> -Sam
>
> On Fri, Nov 21, 2014 at 1:07 PM, Mark Nelson <mark.nelson@xxxxxxxxxxx>
> wrote:
>> It'd be nice if something like slow OSD detection could exist outside of
>> calamari and itself be an event that we record in the logs and make
>> available via the admin socket (so that calamari could pick it up). That
>> way folks could get it into logstash and other system monitoring tools
>> (say PCP/Nagios/etc).
>>
>> Mark
>>
>>
>> On 11/21/2014 02:58 PM, Samuel Just wrote:
>>>
>>> It's still an open item. #ceph-devel would be a good place to bounce
>>> ideas. Through the admin_socket and perf_counter machinery, the osds
>>> already expose a bunch of information about queue length, latency,
>>> etc. This might actually fit well in calamari, which already gathers
>>> a bunch of those stats.
>>> -Sam
>>>
>>> On Thu, Nov 20, 2014 at 9:00 PM, Sreenath BH <bhsreenath@xxxxxxxxx>
>>> wrote:
>>>>
>>>> Hi All
>>>>
>>>> Slow OSD detection is mentioned as one of the project ideas in
>>>> https://wiki.ceph.com/Development/Project_Ideas
>>>>
>>>> I am interested in implementing this. Is this still an open item?
>>>>
>>>> thanks,
>>>> Sreenath
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
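
For illustration, the ranking proposed at the top of this message (the five OSDs with the highest average slow time, and the five with the highest absolute maximum), together with Sam's point that an OSD has to be judged against the rest of the cluster, might be sketched roughly as below. This is only a sketch under assumptions: it presumes per-OSD slow-request durations have already been collected somehow (for example from the perf counters exposed through the admin socket), and the names `rank_slow_osds` and `outliers_vs_cluster`, the input shape, and the `factor` threshold are all hypothetical, not anything that exists in Ceph.

```python
import heapq
from statistics import mean, median


def rank_slow_osds(samples, n=5):
    """Return (top_by_mean, top_by_max) as lists of OSD ids.

    `samples` is a hypothetical input: a dict mapping an OSD id to a
    list of observed slow-request durations in seconds.
    """
    # Skip OSDs with no samples so mean() and max() never see an empty list.
    stats = {
        osd: (mean(vals), max(vals))
        for osd, vals in samples.items()
        if vals
    }
    top_by_mean = heapq.nlargest(n, stats, key=lambda osd: stats[osd][0])
    top_by_max = heapq.nlargest(n, stats, key=lambda osd: stats[osd][1])
    return top_by_mean, top_by_max


def outliers_vs_cluster(samples, factor=3.0):
    """Flag OSDs whose mean exceeds `factor` times the cluster-wide median.

    The median of per-OSD means stands in for "the other osds in the
    cluster"; the default factor is an arbitrary assumption.
    """
    means = {osd: mean(vals) for osd, vals in samples.items() if vals}
    if not means:
        return []
    cluster_median = median(means.values())
    return sorted(osd for osd, m in means.items() if m > factor * cluster_median)
```

Keeping the two rankings separate matters because an OSD with one very long stall and an OSD that is consistently a little slow would show up in different lists. Comparing an OSD against its own earlier history, which Sam also mentions, is not covered here and would need stored samples over time.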