Hi Eugen,
I will take another look at the templating Redo mentioned, so feel free to ignore this for now. Since you asked, however, I wanted to answer your questions and send the requested screenshots.
We have 70 OSDs, as shown in the first screenshot: 10 are 1TB NVMe drives in our five management nodes, and 60 are large 20TB SATA drives in our five OSD nodes, as shown in the second screenshot. We have two crush_rules that place our cephfs.data pool on the SATA drives and the cephfs.meta, .mgr, and .nfs pools on the NVMe drives (screenshot 3).
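For reference, the rules were created along these lines (from memory, so the rule names, the "default" root, the host failure domain, and the hdd/ssd device classes below are placeholders rather than our exact spec):
———
# one replicated rule per device class
ceph osd crush rule create-replicated replicated_sata default host hdd
ceph osd crush rule create-replicated replicated_nvme default host ssd
# point each pool at the matching rule
ceph osd pool set cephfs.data crush_rule replicated_sata
ceph osd pool set cephfs.meta crush_rule replicated_nvme
ceph osd pool set .mgr crush_rule replicated_nvme
ceph osd pool set .nfs crush_rule replicated_nvme
———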
The fourth screenshot shows the alerts. If I click through them, the following OSDs are mentioned (a mix of NVMe and SATA):
64, 68, 62, 57, 67, 63, 6, 56, 28, 37, 20, 52, 11, 31, 35, 15, 1, 50, 66, 65, 40, 21, 60, 69, 61, 42
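If it's useful to double-check which of those are NVMe vs SATA, something like this should do it ('ceph osd df tree' shows the same thing in its CLASS column; the loop is untested as written):
———
for id in 64 68 62 57 67 63 6 56 28 37 20 52 11 31 35 15 1 50 66 65 40 21 60 69 61 42; do
  printf 'osd.%s ' "$id"; ceph osd crush get-device-class osd.$id
done
———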
Lastly, the fifth and sixth screenshots show the alert rules I see in the dashboard.
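One idea I might try next, adapted from the CephPGImbalance expression in Eugen's diff further down in this thread: compute the average per device class instead of globally, assuming ceph_osd_metadata carries a device_class label. This is only a sketch I haven't run against our cluster:
———
abs(
    (
      ((ceph_osd_numpg > 0) * on (ceph_daemon) group_left(device_class, hostname) ceph_osd_metadata)
      - on (job, device_class) group_left
        avg by (job, device_class) ((ceph_osd_numpg > 0) * on (ceph_daemon) group_left(device_class) ceph_osd_metadata)
    )
    / on (job, device_class) group_left
      avg by (job, device_class) ((ceph_osd_numpg > 0) * on (ceph_daemon) group_left(device_class) ceph_osd_metadata)
) > 0.30
———
If that expression holds up, the NVMe and SATA OSDs would each be compared against their own average instead of the cluster-wide one.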
Thanks again for your time and help.
Sincerely,
Devin

> On Jan 14, 2025, at 2:44 AM, Eugen Block <eblock@xxxxxx> wrote:
>
> Ah, I checked on a newer test cluster (Squid) and now I see what you mean. The alert is shown per OSD in the dashboard; if you open the dropdown, you see which daemons are affected. I think it worked a bit differently in Pacific (that's what the customer is still running) when I last had to modify this. How many OSDs do you have? I noticed that it takes a few seconds for Prometheus to clear the warning with only 3 OSDs in my lab cluster. Maybe you could share a screenshot (with redacted sensitive data) showing the alerts? And the status of the affected OSDs as well.
>
>
> Zitat von "Devin A. Bougie" <devin.bougie@xxxxxxxxxxx>:
>
>> Hi Eugen,
>>
>> No, as far as I can tell I only have one prometheus service running.
>>
>> ———
>>
>> [root@cephman2 ~]# ceph orch ls prometheus --export
>> service_type: prometheus
>> service_name: prometheus
>> placement:
>>   count: 1
>>   label: _admin
>>
>> [root@cephman2 ~]# ceph orch ps --daemon-type prometheus
>> NAME                 HOST                         PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
>> prometheus.cephman2  cephman2.classe.cornell.edu  *:9095  running (12h)  4m ago     3w   350M     -        2.43.0   a07b618ecd1d  5a8d88682c28
>>
>> ———
>>
>> Anything else I can check or do?
>>
>> Thanks,
>> Devin
>>
>> On Jan 13, 2025, at 6:39 PM, Eugen Block <eblock@xxxxxx> wrote:
>>
>> Do you have two Prometheus instances? Maybe you could share
>> ceph orch ls prometheus --export
>>
>> Or alternatively:
>> ceph orch ps --daemon-type prometheus
>>
>> You can use two instances for HA, but then you need to change the threshold for both, of course.
>>
>> Zitat von "Devin A. Bougie" <devin.bougie@xxxxxxxxxxx>:
>>
>> Thanks, Eugen! Just in case you have any more suggestions, this still isn’t quite working for us.
>>
>> Perhaps one clue is that in the Alerts view of the cephadm dashboard, every alert is listed twice. We see two CephPGImbalance alerts, both set to 30% after redeploying the service. If I then follow your procedure, one of the alerts updates to 50% as configured, but the other stays at 30%. Is it normal to see each alert listed twice, or did I somehow make a mess of things when trying to change the default alerts?
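>> In case it helps pin down where the duplicate comes from, this is roughly what I've been checking (assuming the dashboard exposes get- counterparts to its set-*-api-host commands; the curl just asks Prometheus itself how many rule definitions it has, using our host/port):
>> ———
>> # endpoints the dashboard itself is pointed at
>> ceph dashboard get-prometheus-api-host
>> ceph dashboard get-alertmanager-api-host
>> # number of CephPGImbalance rule definitions Prometheus is evaluating
>> curl -s http://cephman2.classe.cornell.edu:9095/api/v1/rules | grep -o '"name":"CephPGImbalance"' | wc -l
>> ———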
>>
>> No problem if it’s not an obvious answer, we can live with and ignore the spurious CephPGImbalance alerts.
>>
>> Thanks again,
>> Devin
>>
>> On Jan 7, 2025, at 2:14 AM, Eugen Block <eblock@xxxxxx> wrote:
>>
>> Hi,
>>
>> sure thing, here's the diff showing how I changed it to a 50% deviation instead of 30% (note the diff compares my edited file against the .dist backup, so the '-' lines are my change and the '+' lines are the shipped default):
>>
>> ---snip---
>> diff -u /var/lib/ceph/{FSID}/prometheus.host1/etc/prometheus/alerting/ceph_alerts.yml /var/lib/ceph/{FSID}/prometheus.host1/etc/prometheus/alerting/ceph_alerts.yml.dist
>> --- /var/lib/ceph/{FSID}/prometheus.host1/etc/prometheus/alerting/ceph_alerts.yml 2024-12-17 10:03:23.540179209 +0100
>> +++ /var/lib/ceph/{FSID}/prometheus.host1/etc/prometheus/alerting/ceph_alerts.yml.dist 2024-12-17 10:03:00.380883413 +0100
>> @@ -237,13 +237,13 @@
>> type: "ceph_default"
>> - alert: "CephPGImbalance"
>> annotations:
>> - description: "OSD {{ $labels.ceph_daemon }} on {{ $labels.hostname }} deviates by more than 50% from average PG count."
>> + description: "OSD {{ $labels.ceph_daemon }} on {{ $labels.hostname }} deviates by more than 30% from average PG count."
>> summary: "PGs are not balanced across OSDs"
>> expr: |
>> abs(
>> ((ceph_osd_numpg > 0) - on (job) group_left avg(ceph_osd_numpg > 0) by (job)) /
>> on (job) group_left avg(ceph_osd_numpg > 0) by (job)
>> - ) * on (ceph_daemon) group_left(hostname) ceph_osd_metadata > 0.50
>> + ) * on (ceph_daemon) group_left(hostname) ceph_osd_metadata > 0.30
>> ---snip---
>>
>> Then you restart prometheus ('ceph orch ps --daemon-type prometheus' shows you the exact daemon name):
>>
>> ceph orch daemon restart prometheus.host1
>>
>> This will only work until you upgrade prometheus, of course.
>>
>> Regards,
>> Eugen
>>
>>
>> Zitat von "Devin A. Bougie" <devin.bougie@xxxxxxxxxxx>:
>>
>> Thanks, Eugen. I’m afraid I haven’t yet found a way to either disable the CephPGImbalance alert or change it to handle different OSD sizes. Changing /var/lib/ceph/<cluster_id>/home/ceph_default_alerts.yml doesn’t seem to have any effect, and I haven’t even managed to change the behavior from within the running prometheus container.
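>> In case it helps narrow things down, a plain grep should at least list every rendered copy of the rule under the cluster directory (just an idea, not something that has fixed it for us):
>> ———
>> grep -rl CephPGImbalance /var/lib/ceph/<cluster_id>/
>> ———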
>>
>> If you have a functioning workaround, can you give a little more detail on exactly what yaml file you’re changing and where?
>>
>> Thanks again,
>> Devin
>>
>> On Dec 30, 2024, at 12:39 PM, Eugen Block <eblock@xxxxxx> wrote:
>>
>> Funny, I wanted to take a look next week at how to deal with different OSD sizes, or whether somebody already has a fix for that. My workaround is changing the yaml file for Prometheus as well.
>>
>> Zitat von "Devin A. Bougie" <devin.bougie@xxxxxxxxxxx>:
>>
>> Hi, All. We are using cephadm to manage a 19.2.0 cluster on fully-updated AlmaLinux 9 hosts, and would greatly appreciate help modifying or overriding the alert rules in ceph_default_alerts.yml. Is the best option to simply update the /var/lib/ceph/<cluster_id>/home/ceph_default_alerts.yml file?
>>
>> In particular, we’d like to either disable the CephPGImbalance alert or change it to calculate averages per-pool or per-crush_rule instead of globally as in [1].
>>
>> We currently have PG autoscaling enabled, and have two separate crush_rules (one with large spinning disks, one with much smaller nvme drives). Although I don’t believe it causes any technical issues with our configuration, our dashboard is full of CephPGImbalance alerts that would be nice to clean up without having to create periodic silences.
>>
>> Any help or suggestions would be greatly appreciated.
>>
>> Many thanks,
>> Devin
>>
>> [1] https://nam12.safelinks.protection.outlook.com/?url="">
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx