Hi,
I got those metrics back after setting:
reef01:~ # ceph config set mgr mgr/prometheus/exclude_perf_counters false
reef01:~ # curl http://localhost:9283/metrics | grep ceph_osd_op | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  324k  100  324k    0     0  72.5M      0 --:--:-- --:--:-- --:--:-- 79.1M
# HELP ceph_osd_op Client operations
# TYPE ceph_osd_op counter
ceph_osd_op{ceph_daemon="osd.0"} 139650.0
ceph_osd_op{ceph_daemon="osd.11"} 9711090.0
ceph_osd_op{ceph_daemon="osd.2"} 3864.0
ceph_osd_op{ceph_daemon="osd.1"} 25.0
ceph_osd_op{ceph_daemon="osd.4"} 543.0
ceph_osd_op{ceph_daemon="osd.5"} 12192.0
ceph_osd_op{ceph_daemon="osd.3"} 3661521.0
ceph_osd_op{ceph_daemon="osd.6"} 2030.0
I found the option in the docs [1]. The same section appears in the
Quincy docs as well, although the option does not exist in my Quincy
cluster. Maybe that's why my Quincy cluster still exports those
performance counters:
quincy-1:~ # ceph config get mgr mgr/prometheus/exclude_perf_counters
Error ENOENT: unrecognized key 'mgr/prometheus/exclude_perf_counters'
Anyway, this should bring back the metrics the "legacy" way (I guess).
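To tell the two cases above apart in a script, something like the following could work. It is only a sketch: the function name exclude_perf_counters_state is made up for illustration, and it assumes the ceph CLI exits non-zero when it prints the ENOENT error shown above. Any failure of the command (not just an unknown key) is reported as "unsupported".

```shell
# Print the current value of mgr/prometheus/exclude_perf_counters, or
# "unsupported" if the query fails (e.g. the key is unknown, as on Quincy).
# Assumption: "ceph config get" returns a non-zero exit status on ENOENT.
exclude_perf_counters_state() {
    if value=$(ceph config get mgr mgr/prometheus/exclude_perf_counters 2>/dev/null); then
        printf '%s\n' "$value"
    else
        printf 'unsupported\n'
    fi
}

# Usage (on a host with an admin keyring):
#   exclude_perf_counters_state
```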
Apparently, the ceph-exporter daemon is now required on your hosts to
collect those metrics.
After adding the ceph-exporter service (ceph orch apply ceph-exporter)
and setting mgr/prometheus/exclude_perf_counters back to "true", I see
that "ceph_osd_op" metrics are defined but have no values yet.
Apparently I'm still missing something; I'll check tomorrow. But this
could/should be in the upgrade docs IMO.
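A quick way to distinguish "metric is defined" from "metric has values" is to count only the sample lines and skip the "# HELP" / "# TYPE" comments. A minimal sketch follows; count_samples is a hypothetical helper name, and the prefix match means ceph_osd_op also counts ceph_osd_op_* series.

```shell
# Count sample lines whose metric name starts with the given prefix.
# Comment lines ("# HELP", "# TYPE") are ignored. Reads the Prometheus
# text exposition on stdin and prints a single number.
count_samples() {
    awk -v p="$1" '!/^#/ && index($0, p) == 1 { n++ } END { print n + 0 }'
}

# Usage against the mgr endpoint used in this thread:
#   curl -s http://localhost:9283/metrics | count_samples ceph_osd_op
```

A result of 0 while grep still shows HELP/TYPE lines matches what I described above: the metric family is declared but no daemon reports a value for it.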
Regards,
Eugen
[1]
https://docs.ceph.com/en/latest/mgr/prometheus/#ceph-daemon-performance-counters-metrics
Quoting Martin <ceph@xxxxxxxxxxxxx>:
Hi,
Confirmed that this happens to me as well.
After upgrading from 18.2.0 to 18.2.1, OSD metrics such as
ceph_osd_op_* are missing from ceph-mgr.
The Grafana dashboard also doesn't display all graphs correctly, e.g.
in ceph-dashboard/Ceph - Cluster: Capacity used, Cluster I/O, OSD
Capacity Utilization, PGs per OSD....
curl http://localhost:9283/metrics | grep -i ceph_osd_op
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 38317  100 38317    0     0   9.8M      0 --:--:-- --:--:-- --:--:-- 12.1M
Before upgrading to Reef 18.2.1 I could get all the metrics.
Martin
On 18/01/2024 12:32, Jose Vicente wrote:
Hi,
After upgrading from Quincy to Reef, the ceph-mgr daemon no longer
exports some OSD throughput metrics such as ceph_osd_op_*:
curl http://localhost:9283/metrics | grep -i ceph_osd_op
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  295k  100  295k    0     0   144M      0 --:--:-- --:--:-- --:--:--  144M
However I can get other metrics like:
# curl http://localhost:9283/metrics | grep -i ceph_osd_apply
# HELP ceph_osd_apply_latency_ms OSD stat apply_latency_ms
# TYPE ceph_osd_apply_latency_ms gauge
ceph_osd_apply_latency_ms{ceph_daemon="osd.275"} 152.0
ceph_osd_apply_latency_ms{ceph_daemon="osd.274"} 102.0
...
Before upgrading to Reef (from Quincy) I could get all the
metrics. The MGR prometheus module is enabled.
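To see at a glance which OSD metric families the endpoint still declares (for comparing before and after an upgrade), the "# TYPE" lines can be listed by prefix. This is only a sketch; list_metric_families is a hypothetical helper name, not part of Ceph.

```shell
# Print the distinct metric family names that start with the given prefix,
# taken from the "# TYPE <name> <type>" lines of the exposition text.
# Reads stdin, prints one name per line in sorted order.
list_metric_families() {
    awk -v p="$1" '$1 == "#" && $2 == "TYPE" && index($3, p) == 1 { print $3 }' | sort -u
}

# Usage against the mgr endpoint queried above:
#   curl -s http://localhost:9283/metrics | list_metric_families ceph_osd_
```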
Rocky Linux release 8.8 (Green Obsidian)
ceph version 18.2.1 (7fe91d5d5842e04be3b4f514d6dd990c54b29c76) reef (stable)
# netstat -nap | grep 9283
tcp        0      0 127.0.0.1:53834         127.0.0.1:9283          ESTABLISHED 3561/prometheus
tcp6       0      0 :::9283                 :::*                    LISTEN      804985/ceph-mgr
Thanks,
Jose C.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx