I can see the asok files are there:
root@ceph1:/var/run/ceph# ls -la
total 0
drwxrwx--- 2 ceph ceph 80 Feb 1 10:51 .
drwxr-xr-x 18 root root 640 Feb 1 10:52 ..
srwxr-xr-x 1 ceph ceph 0 Feb 1 10:51 ceph-mon.ceph1.asok
srwxr-xr-x 1 root root 0 Jan 27 15:08 ceph-osd.0.asok
root@ceph1:/var/run/ceph#
root@ceph1:/var/run/ceph#
root@ceph1:/var/run/ceph#
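The same check can be scripted. This is a minimal sketch, assuming the admin sockets live under /var/run/ceph and use the *.asok naming seen in the listing above; list_admin_sockets is a hypothetical helper name, not part of Diamond or Ceph.

```python
import glob
import os
import stat


def list_admin_sockets(run_dir="/var/run/ceph"):
    """Return (path, owner_uid, is_socket) for every *.asok entry in run_dir."""
    results = []
    for path in sorted(glob.glob(os.path.join(run_dir, "*.asok"))):
        info = os.stat(path)
        # S_ISSOCK is True only for a real UNIX socket, not a leftover plain file.
        results.append((path, info.st_uid, stat.S_ISSOCK(info.st_mode)))
    return results


if __name__ == "__main__":
    for path, uid, is_socket in list_admin_sockets():
        kind = "socket" if is_socket else "NOT a socket"
        print("%s uid=%d %s" % (path, uid, kind))
```

It only reports whether each entry exists and is a socket, plus the owning uid; it does not tell you whether a daemon is actually listening on it.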
Running diamond in debug mode shows the following:
[2016-02-01 10:55:23,774] [Thread-1] Collecting data from: NetworkCollector
[2016-02-01 10:56:23,484] [Thread-1] Collecting data from: CPUCollector
[2016-02-01 10:56:23,487] [Thread-6] Collecting data from: MemoryCollector
[2016-02-01 10:56:23,489] [Thread-7] Collecting data from: SockstatCollector
[2016-02-01 10:56:23,768] [Thread-1] Collecting data from: CephCollector
[2016-02-01 10:56:23,768] [Thread-1] gathering service stats for /var/run/ceph/ceph-mon.ceph1.asok
[2016-02-01 10:56:24,094] [Thread-1] Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/diamond/collector.py", line 412, in _run
    self.collect()
  File "/usr/share/diamond/collectors/ceph/ceph.py", line 464, in collect
    self._collect_service_stats(path)
  File "/usr/share/diamond/collectors/ceph/ceph.py", line 450, in _collect_service_stats
    self._publish_stats(counter_prefix, stats, schema, GlobalName)
  File "/usr/share/diamond/collectors/ceph/ceph.py", line 305, in _publish_stats
    assert path[-1] == 'type'
AssertionError
[2016-02-01 10:56:24,096] [Thread-8] Collecting data from: LoadAverageCollector
[2016-02-01 10:56:24,098] [Thread-1] Collecting data from: VMStatCollector
[2016-02-01 10:56:24,099] [Thread-1] Collecting data from: DiskUsageCollector
[2016-02-01 10:56:24,104] [Thread-9] Collecting data from: DiskSpaceCollector
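For context, an assertion of the form `path[-1] == 'type'` can fail when a nested counter schema carries leaf keys other than 'type'. The sketch below is an illustration only: flatten and check_schema are hypothetical helpers and both sample schemas are made up, so this is not the code in ceph.py and may not be the cause of the traceback above.

```python
def flatten(node, path=()):
    """Yield (path, value) for every leaf of a nested dict."""
    for key, value in node.items():
        if isinstance(value, dict):
            for item in flatten(value, path + (key,)):
                yield item
        else:
            yield path + (key,), value


def check_schema(schema):
    """Return the leaf paths that would violate `path[-1] == 'type'`."""
    return [path for path, _ in flatten(schema) if path[-1] != "type"]


# Made-up schemas: one with only 'type' leaves, one with an extra leaf field.
only_type = {"mon": {"num_sessions": {"type": 2}}}
extra_field = {"mon": {"num_sessions": {"type": 2, "description": "Open sessions"}}}

print(check_schema(only_type))    # []
print(check_schema(extra_field))  # [('mon', 'num_sessions', 'description')]
```

If something like this is what is happening, comparing the schema returned by the mon admin socket against what the collector expects would show which extra keys are present.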
Checking the md5 of the file returns the following:
root@ceph1:/var/run/ceph# md5sum /usr/share/diamond/collectors/ceph/ceph.py
aeb3915f8ac7fdea61495805d2c99f33 /usr/share/diamond/collectors/ceph/ceph.py
root@ceph1:/var/run/ceph#
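When comparing the collector file across several nodes, the checksum can also be computed from Python. This is a minimal sketch using the standard hashlib module; md5_of_file is a hypothetical helper name.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_file("/usr/share/diamond/collectors/ceph/ceph.py") should give the same digest that md5sum prints for that path.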
I've found that replacing the ceph.py file with the version whose md5 is shown below stops the diamond error:
root@ceph1:/usr/share/diamond/collectors/ceph# md5sum ceph.py
13ac74ce0df39a5def879cb5fc530015 ceph.py
[2016-02-01 11:14:33,116] [Thread-42] Collecting data from: MemoryCollector
[2016-02-01 11:14:33,117] [Thread-1] Collecting data from: CPUCollector
[2016-02-01 11:14:33,123] [Thread-43] Collecting data from: SockstatCollector
[2016-02-01 11:14:35,453] [Thread-1] Collecting data from: CephCollector
[2016-02-01 11:14:35,454] [Thread-1] checking /var/run/ceph/ceph-mon.ceph1.asok
[2016-02-01 11:14:35,552] [Thread-1] checking /var/run/ceph/ceph-osd.0.asok
[2016-02-01 11:14:35,685] [Thread-44] Collecting data from: LoadAverageCollector
[2016-02-01 11:14:35,686] [Thread-1] Collecting data from: VMStatCollector
[2016-02-01 11:14:35,687] [Thread-1] Collecting data from: DiskUsageCollector
[2016-02-01 11:14:35,692] [Thread-45] Collecting data from: DiskSpaceCollector
But after all that, it's still NOT working.
What Diamond version are you running?
I'm running Diamond version 3.4.67.
On Mon, Feb 1, 2016 at 11:01 PM, Daniel Rolfe <daniel.rolfe.au@xxxxxxxxx> wrote:
On Mon, Feb 1, 2016 at 12:24 PM, hnuzhoulin <hnuzhoulin2@xxxxxxxxx> wrote:
Yes, I fixed it in my environment.
BTW, I checked the md5 of the ceph collector file. It is correct.
On Sun, 31 Jan 2016 22:46:42 +0800, Daniel Rolfe <daniel.rolfe.au@xxxxxxxxx> wrote:
Hi, thanks for the reply.
Just to confirm, did you manage to fix this issue?
I've restarted the whole ceph cluster a few times.
Sent from my iPhone
I just faced the same problem.
The problem was that my cluster was missing the asok files of the mons, although the cluster was working well.
So killing the mon process and restarting it may fix it. (Using the service command to restart the mon daemon may not work.)
On Sun, 31 Jan 2016 10:35:25 +0800, Daniel Rolfe <daniel.rolfe.au@xxxxxxxxx> wrote:
I seem to be having an issue with global ceph stats getting back to calamari.
Individual node and osd stats are working.
If anyone can point me in the right direction, that would be great.
---------------------------
hnuzhoulin2@xxxxxxxxx
---------------------------
hnuzhoulin@xxxxxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com