Hi Frédéric,
So that's another half year added to the previous half-year wait for basic
IPv6 clusters. If only 'ceph health mute' actually accomplished the goal as
a workaround. Note that even when all complaints are 'suppressed', the
dashboard still shows the flashing red warning dot and the '! Cluster
critical' advice.
I think that bug has two levels. First: even when other warnings/errors
are suppressed, the error that complains of being in a health error state
for more than 5 minutes remains. Second: even when the 'things have been
bad for 5 minutes' warning is also silenced, the '! Critical' advice
and the flashing red 'ceph is broken' dot remain. All this while, under
'Observability', Alerts shows all is well.
Ceph is good in the engine room, but the steering wheel and dashboard
need some work to match the advertising and the quality of the rest!
Harry
On 2/7/25 16:24, Frédéric Nass wrote:
Hi Harry,
It's a harmless bug [1] related to IPv6 clusters. It will be fixed
in v19.2.2. The workaround is to mute the error with 'ceph health mute
...'. It's all you can do for now.
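For illustration, here is a minimal Python sketch of how that mute command might be assembled and run. The health code OSD_UNREACHABLE is taken from the log excerpt in the original report; the helper name mute_health_check, the 4h duration, and the dry_run switch are assumptions made for this example, not part of Ceph. It relies only on the documented 'ceph health mute <code> [<duration>] [--sticky]' form.

```python
import shlex
import subprocess


def mute_health_check(code, ttl="", sticky=False, dry_run=True):
    """Build (and optionally run) a 'ceph health mute' command.

    Hypothetical helper for illustration only.
    """
    cmd = ["ceph", "health", "mute", code]
    if ttl:
        # Optional duration after which the mute expires, e.g. "4h".
        cmd.append(ttl)
    if sticky:
        # Keep the mute in place even if the alert clears and comes back.
        cmd.append("--sticky")
    if dry_run:
        # Only show what would be executed; nothing touches the cluster.
        print(shlex.join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: prints the command instead of executing it.
mute_health_check("OSD_UNREACHABLE", ttl="4h", sticky=True)
```

With dry_run left at its default the function only prints the command, so it can be tried on a machine without a cluster; pass dry_run=False on an admin node to actually apply the mute.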
Regards,
.
------------------------------------------------------------------------
*From:* Harry G Coin <hgcoin@xxxxxxxxx>
*Sent:* Friday, 7 February 2025 22:52
*To:* ceph-users
*Subject:* 19.2.1: HEALTH_ERR 27 osds(s) are not reachable. (Yet working normally...)
19.2.1 complains that all OSDs are unreachable because their public address
isn't in the public subnet. However, they are all within the subnet
and are working normally.
It's embarrassing for the dashboard to glow red over a totally crippled
OSD roster while everything is working normally. This existed in the
previous release as well, but it was working prior to 19.
Detail:
Notice, for osd.0, the dashboard lists
public_addr
[fc00:1002:c7::44]:6807/4160993080
But, we have in the logs:
7/2/25 03:35 PM[ERR] osd.0's public address is not in 'fc00:1002:c7::/64' subnet
7/2/25 03:35 PM[ERR][ERR] OSD_UNREACHABLE: 27 osds(s) are not reachable
7/2/25 03:35 PM[ERR]Health detail: HEALTH_ERR 27 osds(s) are not reachable
However, as per the osd.0 attributes, the public address for osd.0 is
well inside the stated public subnet.
All the OSDs are similarly configured, working, and reported as
unreachable at the same time, for the same reason.
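As an independent sanity check of the claim above, here is a minimal Python sketch that tests whether the osd.0 address quoted from the dashboard falls inside the subnet named in the log. The helper name addr_in_subnet and its parsing of the '[addr]:port/nonce' form are assumptions made for this example; it is not how Ceph itself performs the check.

```python
import ipaddress


def addr_in_subnet(public_addr, subnet):
    """Return True if the IP part of a Ceph-style address lies in subnet.

    Hypothetical helper for illustration only.
    """
    # Drop the trailing "/nonce", leaving "[ipv6]:port" or "ipv4:port".
    host_port = public_addr.split("/", 1)[0]
    if host_port.startswith("["):
        # IPv6 form: take the text between the brackets.
        host = host_port[1:host_port.index("]")]
    else:
        # IPv4 form: strip the ":port" suffix.
        host = host_port.rsplit(":", 1)[0]
    return ipaddress.ip_address(host) in ipaddress.ip_network(subnet)


# Values copied from the dashboard and the log excerpt above.
print(addr_in_subnet("[fc00:1002:c7::44]:6807/4160993080", "fc00:1002:c7::/64"))
```

The standard library's ipaddress module treats the address as a member of the /64, which matches the observation that the OSDs are in the subnet and working normally.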
Tell me there's a way to fix this without waiting a further half year....
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx