Hi,

Check that the zabbix_sender binary works and that the Zabbix server is
reachable over the network from the mgr host. The mgr does call
zabbix_sender, but the process returns a bad exit code:
"/usr/bin/zabbix_sender exited non-zero"

k
Sent from my iPhone

> On 20 Oct 2021, at 00:46, shubjero <shubjero@xxxxxxxxx> wrote:
>
> Hey all,
>
> Recently upgraded to Ceph Octopus (15.2.14). We also run Zabbix
> 5.0.15. Have had ceph/zabbix monitoring for a long time. After the
> Ceph Octopus update I installed the latest version of the Ceph
> template in Zabbix
> (https://github.com/ceph/ceph/blob/master/src/pybind/mgr/zabbix/zabbix_template.xml).
>
> Zabbix is successfully getting metrics for all the items in the items
> list in my 'ceph' zabbix host. The ceph zabbix host is configured by
> fsid so that any of my 3 ceph-mgr's can send data to it via uuid.
>
> Here's the ceph zabbix config:
>
> {
>     "discovery_interval": 100,
>     "identifier": "4a158d27-f750-41d5-9e7f-26ce4c9d2d45",
>     "interval": 60,
>     "log_level": "",
>     "log_to_cluster": false,
>     "log_to_cluster_level": "info",
>     "log_to_file": false,
>     "zabbix_host": "172.25.4.20",
>     "zabbix_port": 10051,
>     "zabbix_sender": "/usr/bin/zabbix_sender"
> }
>
> But for some reason when I run 'ceph zabbix send' or 'ceph zabbix
> discovery' I get the following errors:
>
> # ceph zabbix send
> Failed to send data to Zabbix
> # ceph zabbix discovery
> Failed to send discovery data to Zabbix
>
> And the ceph logs are constantly logging zabbix errors:
> # ceph log last
> 2021-10-19T17:40:00.005371-0400 mon.controller1 (mon.0) 682609 :
> cluster [INF] overall HEALTH_OK
> 2021-10-19T17:40:04.347459-0400 mon.controller1 (mon.0) 682611 :
> cluster [WRN] Health check failed: Failed to send data to Zabbix
> (MGR_ZABBIX_SEND_FAILED)
> 2021-10-19T17:40:05.352579-0400 mon.controller1 (mon.0) 682612 :
> cluster [INF] Health check cleared: MGR_ZABBIX_SEND_FAILED (was:
> Failed to send data to Zabbix)
> 2021-10-19T17:40:05.352611-0400 mon.controller1 (mon.0) 682613 :
> cluster [INF] Cluster is now healthy
> 2021-10-19T17:41:06.196293-0400 mon.controller1 (mon.0) 682647 :
> cluster [WRN] Health check failed: Failed to send data to Zabbix
> (MGR_ZABBIX_SEND_FAILED)
> 2021-10-19T17:41:07.260666-0400 mon.controller1 (mon.0) 682649 :
> cluster [INF] Health check cleared: MGR_ZABBIX_SEND_FAILED (was:
> Failed to send data to Zabbix)
> 2021-10-19T17:41:07.260689-0400 mon.controller1 (mon.0) 682650 :
> cluster [INF] Cluster is now healthy
>
> I've tried setting debug_mgr and debug_mon to 20/20 to look for
> additional detail but I didn't see much more other than:
>
> 2021-10-19T17:15:27.042-0400 7f2c6c50d700 7
> mon.controller1@0(leader).log v30689480 update_from_paxos applying
> incremental log 30689480 2021-10-19T17:15:26.604054-0400
> mon.controller3 (mon.2) 42876 : audit [DBG] from='mgr.490501944
> 172.25.12.17:0/3421653' entity='mgr.controller1' cmd=[{"prefix":
> "config-key get", "key": "mgr/zabbix/zabbix_host"}]: dispatch
> "MGR_ZABBIX_SEND_FAILED": {
> "message": "Failed to send data to Zabbix",
> "message": "/usr/bin/zabbix_sender exited non-zero: b''"
>
>
> If anyone has any tips for troubleshooting that would be greatly appreciated!
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
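
The two checks suggested in the reply (is the binary usable, is the Zabbix server reachable) can be tried by hand from the host running the active ceph-mgr. Below is a minimal sketch in Python, not what the mgr module does internally. The function names are made up for illustration, and the item key you pass to run_sender() is a placeholder that has to match an item from the template. The options -z, -p, -s, -k, -o and -vv are standard zabbix_sender options. It is probably worth running this as the same user the mgr runs as (often "ceph", but that is an assumption about your setup).

```python
import os
import socket
import subprocess


def sender_is_executable(path):
    """Return True if path is a regular file the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unroutable hosts.
        return False


def run_sender(sender, zabbix_host, zabbix_port, identifier, key, value):
    """Run zabbix_sender once by hand; return (exit code, stdout, stderr).

    key is a placeholder here: it must be an item key that exists on the
    Zabbix host named by identifier, otherwise the value is not processed.
    """
    cmd = [
        sender,
        "-z", zabbix_host,       # Zabbix server address
        "-p", str(zabbix_port),  # Zabbix trapper port
        "-s", identifier,        # host name as configured in Zabbix
        "-k", key,               # item key
        "-o", str(value),        # item value
        "-vv",                   # verbose output, useful when it fails
    ]
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr
```

With the config from the original post, that would mean calling sender_is_executable("/usr/bin/zabbix_sender"), then tcp_reachable("172.25.4.20", 10051), and finally run_sender() with the fsid "4a158d27-f750-41d5-9e7f-26ce4c9d2d45" as the identifier. The mgr only reported b'' as output, so the verbose output of a manual run may say more about why the exit code is non-zero.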