Re: "Cannot get stat of OSD" in ceph.mgr.log upon enabling influx plugin

On Mon, Feb 19, 2018 at 3:07 PM, Benjeman Meekhof <bmeekhof@xxxxxxxxx> wrote:
> The 'cannot stat' messages are normal at startup; we also see them in
> our working setup with the mgr influx module. Maybe they could be fixed
> by delaying the module startup, or by having it check for some other
> 'all good' status, but I haven't looked into it. You should only be
> seeing them when the mgr initially loads.

The log spam was also recently reported as happening when OSDs are
down (https://tracker.ceph.com/issues/23017). This is annoying; I've
just written a patch for that bug.

John

>
> As far as not getting data, if the self-test works and outputs metrics
> then the module is reading metrics ok from the mgr.  A few things you
> could try:
>
> - Check that the user you set up has rights to the destination
> database, or admin rights to create the database if you did not create
> and set it up beforehand
> - Increase mgr debug and see if anything is showing up:
>   ceph tell mgr.* injectargs '--debug_mgr 20'
>   (this will be a lot of logging, be sure to reset to the 1/5 default)
> - Check that your influx server is getting the traffic:
>   tcpdump -i eth1 port 8086 and src host.example
>
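The first check in the list above can be run from the mgr node without extra packages. This is a minimal sketch, assuming the InfluxDB 1.x HTTP API on its default port 8086 with credentials passed as the `u` and `p` query parameters; the function names and the `influxdb.example` host are illustrative, not part of the influx module. With authentication enabled, a non-admin user should only be shown databases it has been granted access to, so a missing entry would point at a permissions problem.

```python
import json
import urllib.parse
import urllib.request


def influx_url(host, path, params=None, port=8086, ssl=False):
    """Build a URL for the InfluxDB 1.x HTTP API (hypothetical helper)."""
    scheme = "https" if ssl else "http"
    query = "?" + urllib.parse.urlencode(params) if params else ""
    return f"{scheme}://{host}:{port}{path}{query}"


def visible_databases(payload):
    """Extract database names from a decoded SHOW DATABASES response."""
    names = []
    for result in payload.get("results", []):
        for series in result.get("series", []):
            names.extend(row[0] for row in series.get("values", []))
    return names


def check_database(host, username, password, database, port=8086, ssl=False):
    """Return True if `database` is visible to the given user.

    Raises urllib.error.HTTPError (e.g. 401) if the credentials are rejected.
    """
    url = influx_url(
        host,
        "/query",
        {"q": "SHOW DATABASES", "u": username, "p": password},
        port=port,
        ssl=ssl,
    )
    with urllib.request.urlopen(url, timeout=10) as resp:
        payload = json.load(resp)
    return database in visible_databases(payload)


# Example call with placeholder values (substitute your own settings):
# check_database("influxdb.example", "cephstat", "secret", "ceph_stats")
```

If the call returns False, or fails with an HTTP 401 or a connection error, the problem is more likely on the InfluxDB or network side than in the mgr module.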
> thanks,
> Ben
>
> On Mon, Feb 19, 2018 at 9:36 AM,  <knawnd@xxxxxxxxx> wrote:
>> Forgot to mention that influx self-test produces reasonable output too
>> (a long JSON list with some metrics and timestamps), and the following
>> lines appear in the mgr log:
>>
>> 2018-02-19 17:35:04.208858 7f33a50ec700  1 mgr.server reply handle_command
>> (0) Success
>> 2018-02-19 17:35:04.245285 7f33a50ec700  0 log_channel(audit) log [DBG] :
>> from='client.344950 <ceph-mgr-IP>:0/3773014505' entity='client.admin'
>> cmd=[{"prefix": "influx self-test"}]: dispatch
>> 2018-02-19 17:35:04.245314 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer status'
>> 2018-02-19 17:35:04.245319 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer mode'
>> 2018-02-19 17:35:04.245323 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer on'
>> 2018-02-19 17:35:04.245327 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer off'
>> 2018-02-19 17:35:04.245331 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer eval'
>> 2018-02-19 17:35:04.245335 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer eval-verbose'
>> 2018-02-19 17:35:04.245339 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer optimize'
>> 2018-02-19 17:35:04.245343 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer show'
>> 2018-02-19 17:35:04.245347 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer rm'
>> 2018-02-19 17:35:04.245351 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer reset'
>> 2018-02-19 17:35:04.245354 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer dump'
>> 2018-02-19 17:35:04.245358 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'balancer execute'
>> 2018-02-19 17:35:04.245363 7f33a50ec700  1 mgr.server handle_command
>> pyc_prefix: 'influx self-test'
>> 2018-02-19 17:35:04.402782 7f33a58ed700  1 mgr.server reply handle_command
>> (0) Success Self-test OK
>>
>> knawnd@xxxxxxxxx wrote on 19/02/18 17:27:
>>
>>> Dear Ceph users,
>>>
>>> I am trying to enable the influx plugin for Ceph following
>>> http://docs.ceph.com/docs/master/mgr/influx/ but no data arrives in
>>> InfluxDB. As soon as the 'ceph mgr module enable influx' command is
>>> executed on one of the ceph mgr nodes (running CentOS 7.4.1708), the
>>> following messages appear in /var/log/ceph/ceph-mgr.<ceph-mgr-host>.log:
>>>
>>> 2018-02-19 17:11:05.947122 7f33c9b43600  0 ceph version 12.2.2
>>> (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable), process
>>> (unknown), pid 96425
>>> 2018-02-19 17:11:05.947737 7f33c9b43600  0 pidfile_write: ignore empty
>>> --pid-file
>>> 2018-02-19 17:11:05.986676 7f33c9b43600  1 mgr send_beacon standby
>>> 2018-02-19 17:11:06.003029 7f33c0e2a700  1 mgr init Loading python module
>>> 'balancer'
>>> 2018-02-19 17:11:06.031293 7f33c0e2a700  1 mgr init Loading python module
>>> 'dashboard'
>>> 2018-02-19 17:11:06.119328 7f33c0e2a700  1 mgr init Loading python module
>>> 'influx'
>>> 2018-02-19 17:11:06.220394 7f33c0e2a700  1 mgr init Loading python module
>>> 'restful'
>>> 2018-02-19 17:11:06.398380 7f33c0e2a700  1 mgr init Loading python module
>>> 'status'
>>> 2018-02-19 17:11:06.919109 7f33c0e2a700  1 mgr handle_mgr_map Activating!
>>> 2018-02-19 17:11:06.919454 7f33c0e2a700  1 mgr handle_mgr_map I am now
>>> activating
>>> 2018-02-19 17:11:06.952174 7f33a58ed700  1 mgr load Constructed class from
>>> module: balancer
>>> 2018-02-19 17:11:06.953259 7f33a58ed700  1 mgr load Constructed class from
>>> module: dashboard
>>> 2018-02-19 17:11:06.953959 7f33a58ed700  1 mgr load Constructed class from
>>> module: influx
>>> 2018-02-19 17:11:06.954193 7f33a58ed700  1 mgr load Constructed class from
>>> module: restful
>>> 2018-02-19 17:11:06.955549 7f33a58ed700  1 mgr load Constructed class from
>>> module: status
>>> 2018-02-19 17:11:06.955613 7f33a58ed700  1 mgr send_beacon active
>>> 2018-02-19 17:11:06.960224 7f33a58ed700  1 mgr[restful] Unknown request ''
>>> 2018-02-19 17:11:06.961912 7f33a28e7700  1 mgr[restful] server not
>>> running: no certificate configured
>>> 2018-02-19 17:11:06.969027 7f33a30e8700  0 Cannot get stat of OSD 0
>>> ... and so on for all 64 OSDs I have in the cluster ...
>>>
>>> 'ceph osd tree' shows all OSDs are up. 'ceph health' gives HEALTH_OK.
>>>
>>> python-influxdb-5.0.0-2.el7.noarch is installed on the ceph mgr node.
>>> That rpm was rebuilt from the fc28 srpm.
>>>
>>> 'ceph config-key dump|grep influx' shows reasonable info:
>>>      "mgr/influx/database": "ceph_stats",
>>>      "mgr/influx/hostname": "<influxdb host>",
>>>      "mgr/influx/password": "<censored>",
>>>      "mgr/influx/ssl": "false",
>>>      "mgr/influx/username": "cephstat",
>>>      "mgr/influx/verify_ssl": "false"
>>>
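The key listing above can also be checked mechanically. This is a minimal sketch, assuming `ceph config-key dump` prints a JSON object and that hostname, username and password are the settings the influx module cannot default (database, port, interval, ssl and verify_ssl are assumed to fall back to defaults); the helper names are illustrative.

```python
import json

# Assumption: these settings have no usable default in the influx module.
REQUIRED = ("hostname", "username", "password")
PREFIX = "mgr/influx/"


def influx_settings(dump):
    """Return {short_name: value} for every mgr/influx/* key in the dump."""
    return {k[len(PREFIX):]: v for k, v in dump.items() if k.startswith(PREFIX)}


def missing_settings(dump):
    """List required influx settings that are absent or empty."""
    settings = influx_settings(dump)
    return [name for name in REQUIRED if not settings.get(name)]


# Example use on a mgr node (requires the ceph CLI and an admin keyring):
# import subprocess
# dump = json.loads(subprocess.check_output(["ceph", "config-key", "dump"]))
# print(missing_settings(dump))
```

An empty list means the mandatory settings are present; it says nothing about whether the values themselves are correct.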
>>>
>>> influxdb-1.4.2-1.x86_64 is installed on the influxdb host, which runs
>>> CentOS 7.4.1708.
>>>
>>> I would appreciate any help with this issue.
>>>
>>>
>>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


