Monitoring ceph statistics using rados python module

python-cephclient may be of some use to you

https://github.com/dmsimard/python-cephclient



> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces at lists.ceph.com] On Behalf Of
> Mike Dawson
> Sent: Tuesday, May 13, 2014 10:04 AM
> To: Adrian Banasiak; Haomai Wang
> Cc: ceph-users at ceph.com
> Subject: Re: Monitoring ceph statistics using rados python module
> 
> Adrian,
> 
> Yes, it is single-OSD oriented.
> 
> Like Haomai, we monitor perf dumps from individual OSD admin sockets. On
> new enough versions of ceph, you can do 'ceph daemon osd.x perf dump',
> which is a shorter way to ask for the same output as 'ceph --admin-daemon
> /var/run/ceph/ceph-osd.x.asok perf dump'. Keep in mind, either version has to
> be run locally on the host where osd.x is running.
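
A minimal sketch of scripting that step: shell out to the admin-socket command and parse the JSON it returns. The OSD id and the section names printed below are illustrative assumptions; counter layout varies between Ceph releases.

    import json
    import subprocess

    def perf_dump(osd_id):
        """Run 'ceph daemon osd.<id> perf dump' on the OSD's host and parse the JSON output."""
        out = subprocess.check_output(['ceph', 'daemon', 'osd.%d' % osd_id, 'perf', 'dump'])
        return json.loads(out)

    dump = perf_dump(0)          # assumes osd.0 runs on this host
    print(sorted(dump.keys()))   # top-level sections, e.g. 'osd', 'filestore', 'throttle-*'
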
> 
> We use Sensu to take samples and push them to Graphite. We can then build
> dashboards showing the whole cluster, units in our CRUSH tree, hosts, or
> individual OSDs.
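
For the Graphite leg (normally handled by Sensu here), a rough sketch of pushing one sample over the plaintext protocol; the metric path, host name and port 2003 are assumptions for illustration:

    import socket
    import time

    def send_to_graphite(path, value, host='graphite.example.com', port=2003):
        """Send one sample as '<path> <value> <timestamp>' per Graphite's plaintext protocol."""
        line = '%s %f %d\n' % (path, value, int(time.time()))
        sock = socket.create_connection((host, port))
        try:
            sock.sendall(line.encode('ascii'))
        finally:
            sock.close()

    # e.g. a counter lifted from an OSD perf dump
    send_to_graphite('ceph.osd.0.op_in_bytes', 123456.0)
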
> 
> I have found that monitoring each OSD's admin daemon is critical. Often, a
> single OSD can affect the performance of the entire cluster. Without
> per-OSD data, these types of issues can be quite difficult to pinpoint.
> 
> Also, note that Inktank has developed Calamari. There are rumors that it may
> be open sourced at some point in the future.
> 
> Cheers,
> Mike Dawson
> 
> 
> On 5/13/2014 12:33 PM, Adrian Banasiak wrote:
> > Thanks for the suggestion about the admin daemon, but it looks single-OSD
> > oriented. I have used perf dump on the mon socket and it outputs some
> > interesting data for monitoring the whole cluster:
> > { "cluster": { "num_mon": 4,
> >        "num_mon_quorum": 4,
> >        "num_osd": 29,
> >        "num_osd_up": 29,
> >        "num_osd_in": 29,
> >        "osd_epoch": 1872,
> >        "osd_kb": 20218112516,
> >        "osd_kb_used": 5022202696,
> >        "osd_kb_avail": 15195909820,
> >        "num_pool": 4,
> >        "num_pg": 3500,
> >        "num_pg_active_clean": 3500,
> >        "num_pg_active": 3500,
> >        "num_pg_peering": 0,
> >        "num_object": 400746,
> >        "num_object_degraded": 0,
> >        "num_object_unfound": 0,
> >        "num_bytes": 1678788329609,
> >        "num_mds_up": 0,
> >        "num_mds_in": 0,
> >        "num_mds_failed": 0,
> >        "mds_epoch": 1},
> >
> > Unfortunately, cluster-wide IO statistics are still missing.
> >
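
For what it's worth, the cluster-wide client and recovery IO rates that ceph -s prints live in the pgmap, and the mon will hand them back as JSON. A rough sketch using the rados bindings' mon_command; the pgmap field names are assumptions and only appear while there is matching activity:

    import json
    import rados

    conn = rados.Rados(conffile='/etc/ceph/ceph.conf')
    conn.connect()
    # Ask the monitors for 'ceph status' in JSON form.
    cmd = json.dumps({'prefix': 'status', 'format': 'json'})
    ret, outbuf, errs = conn.mon_command(cmd, b'')
    pgmap = json.loads(outbuf)['pgmap']
    print(pgmap.get('read_bytes_sec', 0), pgmap.get('write_bytes_sec', 0))
    print(pgmap.get('recovering_bytes_per_sec', 0))
    conn.shutdown()
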
> >
> > 2014-05-13 17:17 GMT+02:00 Haomai Wang <haomaiwang at gmail.com>:
> >
> >     I'm not sure exactly what you need.
> >
> >     I use "ceph --admin-daemon /var/run/ceph/ceph-osd.x.asok perf dump" to
> >     get the monitoring info, and the result can be parsed easily with
> >     simplejson in Python.
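
Building on that: latency counters in a perf dump are avgcount/sum pairs, so an average falls out directly. A small sketch; the op_latency counter name is an assumption and differs between releases:

    try:
        import simplejson as json
    except ImportError:
        import json

    # dump.json holds the saved output of a 'perf dump' on an OSD admin socket
    with open('dump.json') as f:
        dump = json.load(f)

    lat = dump['osd']['op_latency']   # {'avgcount': <ops>, 'sum': <total seconds>}
    avg = lat['sum'] / lat['avgcount'] if lat['avgcount'] else 0.0
    print('average op latency: %.6f s' % avg)
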
> >
> >     On Tue, May 13, 2014 at 10:56 PM, Adrian Banasiak
> >     <adrian at banasiak.it> wrote:
> >      > Hi, I am working with a test Ceph cluster and now I want to
> >      > implement Zabbix monitoring with items such as:
> >      >
> >      > - whole cluster IO (for example ceph -s -> recovery io 143 MB/s,
> >      >   35 objects/s)
> >      > - pg statistics
> >      >
> >      > I would like to create a single Python script to retrieve values
> >      > using the rados python module, but there is little information in
> >      > the documentation about module usage. I've created a function which
> >      > sums the current read/write statistics of all pools, but I can't
> >      > find out how to add recovery IO usage and pg statistics:
> >      >
> >      >     import rados
> >      >
> >      >     conn = rados.Rados(conffile='/etc/ceph/ceph.conf')
> >      >     conn.connect()
> >      >     stats = {}
> >      >     read = 0
> >      >     write = 0
> >      >     for pool in conn.list_pools():
> >      >         io = conn.open_ioctx(pool)
> >      >         stats[pool] = io.get_stats()
> >      >         read += int(stats[pool]['num_rd'])
> >      >         write += int(stats[pool]['num_wr'])
> >      >         io.close()
> >      >
> >      > Could someone share their knowledge about using the rados module
> >      > for retrieving Ceph statistics?
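
One piece the rados bindings do expose directly is overall cluster usage, which covers part of this. A minimal sketch (get_cluster_stats returns kb, kb_used, kb_avail and num_objects):

    import rados

    conn = rados.Rados(conffile='/etc/ceph/ceph.conf')
    conn.connect()
    usage = conn.get_cluster_stats()
    pct = 100.0 * usage['kb_used'] / usage['kb']
    print('%.1f%% raw used, %d objects' % (pct, usage['num_objects']))
    conn.shutdown()

As far as I can tell there is no dedicated binding call for recovery IO or pg statistics; the mon_command route sketched earlier in the thread (for example with {'prefix': 'pg stat', 'format': 'json'}) is one way to get at those.
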
> >      >
> >      > BTW Ceph is awesome!
> >      >
> >      > --
> >      > Best regards, Adrian Banasiak
> >      > email: adrian at banasiak.it
> >      >
> >
> >
> >
> >     --
> >     Best Regards,
> >
> >     Wheat
> >
> >
> >
> >
> > --
> > Pozdrawiam, Adrian Banasiak
> > email: adrian at banasiak.it
> >
> >
> _______________________________________________
> ceph-users mailing list
> ceph-users at lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

