Re: Ceph Monitoring

We're just going into production now with a large-ish multisite radosgw setup on 10.2. We are starting off by alerting on anything that isn't HEALTH_OK, just to see how things go. If we get HEALTH_WARN but no mons or OSDs are down, it will be a low-level alert. We will tweak the scripts to pick up on different conditions.
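
As a rough sketch of the policy described above (not the actual scripts), the decision could look something like the following. The function name, the "low"/"high" labels, and the choice to treat every other non-OK state as "high" are assumptions made for illustration. Pulling the health string and the down counts out of the cluster (for example from `ceph status --format json`) is left out, since the layout of that output differs between releases.

```python
def classify_alert(health, mons_down, osds_down):
    """Map a cluster state to an alert level (hypothetical helper).

    health     -- overall health string, e.g. "HEALTH_OK", "HEALTH_WARN", "HEALTH_ERR"
    mons_down  -- number of monitors currently down or out of quorum
    osds_down  -- number of OSDs currently down
    """
    if health == "HEALTH_OK":
        # Nothing to alert on.
        return None
    if health == "HEALTH_WARN" and mons_down == 0 and osds_down == 0:
        # A warning with all daemons up: low-level alert.
        return "low"
    # Assumption: anything else that is not HEALTH_OK is treated as high priority.
    return "high"


if __name__ == "__main__":
    for state in [("HEALTH_OK", 0, 0), ("HEALTH_WARN", 0, 0),
                  ("HEALTH_WARN", 0, 2), ("HEALTH_ERR", 1, 0)]:
        print(state, "->", classify_alert(*state))
```

Keeping the decision in one small function makes it easy to add further conditions later without touching whatever collects the cluster state.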

We're using graphite via collectd for visualization.

    -- Trey


On Fri, Jan 13, 2017 at 3:15 PM, Chris Jones <cjones@xxxxxxxxxxx> wrote:
General question/survey:

For those of you with larger clusters, how are you doing alerting/monitoring? Do you trigger off of 'HEALTH_WARN', etc.? I'm not really asking about collectd-style metrics, but about the initial alerts for an issue or potential issue. What thresholds do you use? Just trying to get a pulse of what others are doing.

Thanks in advance.  

--
Best Regards,
Chris Jones
Bloomberg




_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


