Hi,
After upgrading from 15.2.8 to 15.2.13 with cephadm on CentOS 8
(containerised installation done by cephadm), Grafana no longer shows
new data. Additionally, when I access the Dashboard URL on a host that
is not currently hosting the dashboard, I am redirected to the wrong
hostname (the one shown by `ceph mgr services`).
I assume that both problems have the same cause as this output of
`ceph mgr services`:
{
    "dashboard": "https://ceph-<cluster-id>-mgr.iceph-11.tsmsqs:8443/",
    "prometheus": "http://ceph-<cluster-id>-mgr.iceph-11.tsmsqs:9283/"
}
The correct hostname is iceph-11 (without the tsmsqs part); the FQDN is
iceph-11.servernet. The hosts use DNS, and both names (iceph-11 and
iceph-11.servernet) are resolvable from the hosts as well as from
within the Podman containers.
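To make the mismatch easy to spot, here is a minimal Python sketch that
takes the JSON printed by `ceph mgr services` and reports every service
whose URL points at a hostname outside an expected set. The function
name `unexpected_service_hosts` and its parameters are made up for
illustration; they are not part of Ceph.

```python
import json
from typing import Dict, Iterable
from urllib.parse import urlsplit


def unexpected_service_hosts(services_json: str,
                             expected: Iterable[str]) -> Dict[str, str]:
    """Return {service: hostname} for URLs whose host is not expected.

    services_json is the raw JSON text printed by `ceph mgr services`.
    expected holds the names considered correct, for example the short
    hostname and the FQDN of the mgr host.
    """
    # urlsplit() lowercases the hostname, so compare in lowercase.
    expected_names = {name.lower() for name in expected}
    mismatches = {}
    for service, url in json.loads(services_json).items():
        host = urlsplit(url).hostname or ""
        if host not in expected_names:
            mismatches[service] = host
    return mismatches
```

One way to use it would be to save the output with
`ceph mgr services > services.json`, read that file, and call the
function with `{"iceph-11", "iceph-11.servernet"}` as the expected
names. An empty result means every advertised URL uses an expected
hostname.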
I have determined that podman by default sets the container name as a
hostname alias (visible with `hostname -a` within the container), which
apparently leads Ceph mgr to pick it up as the primary name.
My workaround is to modify
/var/lib/ceph/<cluster-id>/mgr.<hostname>.<random-6-char-string>/unit.run,
adding --no-hosts as an additional argument to the "podman run" command.
I could probably use a system-wide containers.conf as well.
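For anyone who wants to script the unit.run edit, the following is a
rough Python sketch. It assumes that unit.run is a plain shell script in
which the container is started by a line containing "podman run"; the
helper name `add_no_hosts` is hypothetical. It only transforms text, so
reading and writing the file (and keeping a backup) is left to the
caller.

```python
import re

# Matches the "podman run" invocation, with or without a leading path.
# Lines such as "podman rm" do not match.
_PODMAN_RUN = re.compile(r"(\bpodman\s+run\b)")


def add_no_hosts(unit_run_text: str) -> str:
    """Insert --no-hosts after "podman run" on every line that lacks it."""
    patched = []
    for line in unit_run_text.splitlines(keepends=True):
        # Skip lines that already carry the flag, so that running the
        # helper twice does not add it a second time.
        if _PODMAN_RUN.search(line) and "--no-hosts" not in line.split():
            line = _PODMAN_RUN.sub(r"\1 --no-hosts", line, count=1)
        patched.append(line)
    return "".join(patched)
```

Since cephadm generates unit.run itself, a manual or scripted change
like this may be overwritten when the daemon is redeployed or upgraded,
so it probably has to be reapplied afterwards.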
With this workaround and after restarting the Ceph mgr container (via
systemctl) and then restarting Prometheus and Grafana (with ceph orch
redeploy), I once again get data in Grafana and the correct redirect for
the dashboard. `ceph mgr services` also shows expected and correct values.
I am wondering whether this kind of issue is known or whether there is
something wrong with my setup. I expected Ceph mgr to use the primary
hostname and not some seemingly random hostname alias. Perhaps this
issue could also be covered in a troubleshooting section of the
monitoring stack documentation.
Cheers
Sebastian
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx