Hi everyone,

I can confirm that both ceph-exporter and node-exporter are deployed in the non-upgraded v19 test cluster I have where recovery IOPS are visible in the Ceph Dashboard. I find it odd that the dashboard requires an additional service to be deployed to display a single metric when all other metrics are displayed properly without it. It might be a transitional state, with ceph-exporter being replaced by node-exporter and the Prometheus configuration not being entirely updated yet. Just a guess; I haven't checked the tracker, nor have I checked whether node-exporter exports the recovery IOPS metric or not.
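
For what it's worth, a quick way to check would be something along these lines (untested, and assuming the usual defaults of port 9095 for the cephadm-deployed Prometheus, 9926 for ceph-exporter and 9100 for node-exporter; adjust hosts, ports and http/https to your deployment):

  # Is the metric present in Prometheus at all? (host/port/scheme are assumptions, see above)
  curl -sk 'https://<manager_ip>:9095/api/v1/query?query=ceph_osd_recovery_ops' | jq '.data.result | length'

  # Which exporter actually exposes it on a given OSD host?
  curl -s http://<osd_host>:9926/metrics | grep ceph_osd_recovery_ops   # ceph-exporter
  curl -s http://<osd_host>:9100/metrics | grep ceph_osd_recovery_ops   # node-exporter

If only the ceph-exporter endpoint returns it, that would confirm that this particular dashboard panel depends on ceph-exporter being deployed.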
Regards,
Frédéric.

________________________________
From: Eugen Block <eblock@xxxxxx>
Sent: Tuesday, 22 October 2024 11:20
To: ceph-users@xxxxxxx
Subject: Re: Issue with Recovery Throughput Not Visible in Ceph Dashboard After Upgrade to 19.2.0 (Squid)

You're right, deploying ceph-exporter fixes the recovery graph in my lab cluster. I didn't have to redeploy Prometheus. Thanks!

Quoting mailing-lists <mailing-lists@xxxxxxxxx>:

> Hey there,
>
> I've got that problem too, although I got it from updating 17.2.7 to
> 18.2.4. After I read this mail I just fiddled around a bit, and
> Prometheus does not have ceph_osd_recovery_ops.
>
> Then I looked into
> /var/lib/ceph/xyz/prometheus.node-name/etc/prometheus/prometheus.yml
> and there I found:
>
> - job_name: 'ceph-exporter'
>   honor_labels: true
>   http_sd_configs:
>   - url: http://132.252.90.151:8765/sd/prometheus/sd-config?service=ceph-exporter
>
> There are no ceph-exporter services running on my cluster, only
> node-exporters. So I started ceph-exporter services, and there we
> can find ceph_osd_recovery_ops.
>
> I guess they got lost while upgrading? I don't know, but this seems
> to be the solution:
>
> ceph orch apply ceph-exporter
> ceph orch redeploy prometheus
>
> Best
>
> inDane
>
> On 21.10.24 13:59, Frédéric Nass wrote:
>> This could be due to a typo in the panel definition in Grafana
>> (comparing the JSON of the working panel with the non-working one
>> might provide more insights) or because the Prometheus datasource
>> used by Grafana isn't providing any metrics for
>> ceph_osd_recovery_ops.
>>
>> To check the panel in Grafana, you can go to
>> https://<manager_ip>:3000, then Dashboard / Ceph - Cluster /
>> Recovery Rate / Inspect JSON Panel.
>> To verify whether Prometheus provides any metrics, you can go to
>> https://<manager_ip>:9095 and request metrics for
>> ceph_osd_recovery_ops.
>>
>> Check [1] if you need to set the Grafana password.
>>
>> Regards,
>> Frédéric
>>
>> [1] https://www.ibm.com/docs/en/storage-ceph/6?topic=access-setting-admin-user-password-grafana
>>
>> ----- On 21 Oct 24, at 13:09, Sanjay Mohan <sanjaymohan@xxxxxxxxxxxxx> wrote:
>>
>>> Hi Frédéric,
>>> Thank you for the response.
>>> I tried disabling and re-enabling the module, and it seems that
>>> the recovery metrics are indeed being collected, but they are still
>>> not displayed on the new dashboard. Interestingly, I have another
>>> environment running Ceph 19.2.0 where the recovery throughput is
>>> displayed correctly, but I'm unable to identify any major
>>> differences between the two setups.
>>> Do you have any additional suggestions for troubleshooting this
>>> issue further?
>>> Thanks again for your help!
>>> Best regards,
>>> Sanjay Mohan
>>> Software Defined Storage Engineer
>>> sanjaymohan@xxxxxxxxxxxxx
>>>
>>> From: Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx>
>>> Sent: 21 October 2024 1:22 PM
>>> To: Sanjay Mohan <sanjaymohan@xxxxxxxxxxxxx>
>>> Cc: ceph-users <ceph-users@xxxxxxx>
>>> Subject: Re: Issue with Recovery Throughput Not Visible in Ceph
>>> Dashboard After Upgrade to 19.2.0 (Squid)
>>>
>>> Hi Sanjay,
>>> I've just checked the dashboard of a v19.2.0 cluster, and the
>>> recovery throughput is displayed correctly, as shown in the
>>> screenshot here [1]. You might want to consider redeploying the
>>> dashboard.
>>> Regards,
>>> Frédéric.
>>>
>>> [1] https://docs.ceph.com/en/latest/mgr/dashboard/
>>>
>>> ----- On 19 Oct 24, at 19:23, Sanjay Mohan <sanjaymohan@xxxxxxxxxxxxx> wrote:
>>>
>>>> Dear Ceph Users,
>>>> I hope this message finds you well.
>>>> I recently performed an upgrade of my Ceph cluster, moving
>>>> through the following versions:
>>>> * 17.2.7 -> 18.2.0 -> 18.2.2 -> 18.2.4 -> 19.2.0 (Squid)
>>>> After successfully upgrading to Ceph 19.2.0, I noticed an issue
>>>> where the recovery throughput is no longer visible in the new Ceph
>>>> dashboard. However, the old dashboard's metrics and features seem
>>>> to be working as expected. It is important to note that the
>>>> recovery throughput was displayed properly in the previous version
>>>> of the Ceph dashboard. I am using cephadm for the installation,
>>>> not Rook.
>>>> Current behavior:
>>>> * Recovery throughput metrics are not displayed in the new
>>>> dashboard after upgrading to 19.2.0.
>>>> Expected behavior:
>>>> * Recovery throughput should be visible, as it was in previous
>>>> versions of the Ceph dashboard.
>>>> I am reaching out to ask whether there are any known issues,
>>>> workarounds, or upcoming fixes for this. Your assistance in this
>>>> matter would be greatly appreciated.
>>>> Thank you for your time and support. I look forward to hearing
>>>> from you soon.
>>>> Best regards,
>>>> Sanjay Mohan
>>>> Software Defined Storage Engineer
>>>> sanjaymohan@xxxxxxxxxxxxx
>>>> [Amrita University]
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx