Hi,
did you also mute the OSD_UNREACHABLE warning?
ceph health mute OSD_UNREACHABLE 10w
That should bring the cluster back to HEALTH_OK for 10 weeks.
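To confirm the mute actually took effect (a quick sketch; the exact
wording of the muted entries varies a bit between releases):

ceph health detail                    # muted checks should be listed as muted
ceph -s                               # should report HEALTH_OK while the mute is active
ceph health unmute OSD_UNREACHABLE    # lifts the mute again later

There's also a --sticky flag for ceph health mute if you want the mute
to survive the alert clearing and coming back (check the docs for the
exact semantics).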
Quoting Harry G Coin <hgcoin@xxxxxxxxx>:
Hi Nizam,
Answers interposed below.
On 2/10/25 11:56, Nizamudeen A wrote:
Hey Harry,
Do you see that for every alert or only for some of them? If some,
which ones? I just tried a couple of them locally and saw the
dashboard go to a happy state.
My sandbox/dev array has three chronic warnings/errors. The first is a
PG imbalance I'm aware of. The second is that all 27 OSDs are reported
as unreachable. The third is that the array has been in an error state
for more than 5 minutes. Silencing/suppressing all of them still gives
the red flashing 'broken' dot on the dashboard, the '!Cluster' status,
and a notice of Alerts listing the previously suppressed
errors/warnings. Under Observability we see no indication of
errors/warnings under the Alerts menu option -- so you got that one
right.
Can you tell me what ceph health or ceph health detail looks like
after the alert is muted? And also, does ceph -s report HEALTH_OK?
root@noc1:~# ceph -s
  cluster:
    id:     40671....140f8
    health: HEALTH_ERR
            27 osds(s) are not reachable

  services:
    mon: 5 daemons, quorum noc4,noc2,noc1,noc3,sysmon1 (age 10m)
    mgr: noc1.jxxxx(active, since 37m), standbys: noc2.yhxxxxx, noc3.xxxxb, noc4.txxxxc
    mds: 1/1 daemons up, 3 standby
    osd: 27 osds: 27 up (since 14m), 27 in (since 5w)
Ceph's actual core operations are otherwise normal.
It's hard to sell Ceph as a concept when it shows all the storage as
simultaneously unreachable, up, and in. Not a big confidence builder.
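For completeness, here is roughly how I cross-check a contradictory
state like this (standard commands; the grep pattern is just
illustrative):

ceph osd tree                    # all 27 OSDs show as up/in here
ceph osd dump | grep '^osd\.'    # the addresses the monitors have recorded for each OSD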
Regards,
Nizam
On Mon, Feb 10, 2025 at 9:00 PM Harry G Coin <hgcoin@xxxxxxxxx> wrote:
In the same code area: even if all the alerts are silenced, the
dashboard will not show 'green', but red or yellow depending on the
nature of the silenced alerts.
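(For context: I silence through the dashboard UI, which as far as I
understand creates Alertmanager silences underneath. The rough CLI
equivalent would be something like the following; the alert name is
just an example and the Alertmanager URL is hypothetical:)

amtool silence add alertname=CephPGImbalance \
    --alertmanager.url=http://alertmanager.local:9093 \
    --duration=24h --comment='known imbalance on dev array'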
On 2/10/25 04:18, Nizamudeen A wrote:
> Thank you Chris,
>
> I was able to reproduce this. We will look into it and send out a fix.
>
> Regards,
> Nizam
>
> On Fri, Feb 7, 2025 at 10:35 PM Chris Palmer <chris.palmer@xxxxxxxxx> wrote:
>
>> Firstly, thank you so much for the 19.2.1 release. Initial testing
>> suggests that the blockers that we had in 19.2.0 have all been resolved,
>> so we are proceeding with further testing.
>>
>> We have noticed one small problem in 19.2.1 that was not present in
>> 19.2.0, though. We use the older-style dashboard
>> (mgr/dashboard/FEATURE_TOGGLE_DASHBOARD false). The problem happens on
>> the Dashboard screen when health changes to WARN. If you click on WARN
>> you get a small empty dropdown instead of the list of warnings. A
>> javascript error is logged, and browser inspection adds the detail
>> that it happens in polyfill:
>>
>> 2025-02-07T15:59:00.970+0000 7f1d63877640 0 [dashboard ERROR
>> frontend.error] (https://<redacted>:8443/#/dashboard): NG0901
>> Error: NG0901
>>     at d.find (https://<redacted>:8443/main.7869bccdd1b73f3c.js:3:3342365)
>>     at le.ngDoCheck (https://<redacted>:8443/main.7869bccdd1b73f3c.js:3:3173112)
>>     at Qe (https://<redacted>:8443/main.7869bccdd1b73f3c.js:3:3225586)
>>     at bt (https://<redacted>:8443/main.7869bccdd1b73f3c.js:3:3225341)
>>     at cs (https://<redacted>:8443/main.7869bccdd1b73f3c.js:3:3225051)
>>     at $m (https://<redacted>:8443/main.7869bccdd1b73f3c.js:3:3259043)
>>     at jf (https://<redacted>:8443/main.7869bccdd1b73f3c.js:3:3266563)
>>     at S1 (https://<redacted>:8443/main.7869bccdd1b73f3c.js:3:3259790)
>>     at $m (https://<redacted>:8443/main.7869bccdd1b73f3c.js:3:3259801)
>>     at fg (https://<redacted>:8443/main.7869bccdd1b73f3c.js:3:3267248)
>>
>> Also, after this happens, no dropdowns work again until the page is
>> forcibly refreshed.
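>>
>> In case it helps with triage: the toggle we use to select the
>> older-style dashboard can be inspected/changed like this (the setting
>> name is the one mentioned above; flipping it back to true is an
>> untested guess at a workaround on our side, not something we have
>> verified):
>>
>> ceph config get mgr mgr/dashboard/FEATURE_TOGGLE_DASHBOARD
>> ceph config set mgr mgr/dashboard/FEATURE_TOGGLE_DASHBOARD true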
>>
>> Environment is RPM install on Centos 9 Stream.
>>
>> I've created issue [0].
>>
>> Thanks, Chris
>>
>> [0] https://tracker.ceph.com/issues/69867
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx