Re: Node Exporter keeps failing while upgrading cluster in an air-gapped (isolated) environment

Hello Adam,

Thanks for the prompt response.
We have the following image in our private registry for node-exporter:

192.168.1.10:5000/prometheus/node-exporter           v1.5.0     0da6a335fe13   19 months ago   22.5MB

But after the Ceph upgrade, the following command still returns the upstream image (quay.io/prometheus/node-exporter:v1.5.0):
root@node-01:~# ceph config get mgr mgr/cephadm/container_image_node_exporter
quay.io/prometheus/node-exporter:v1.5.0

I am a bit unsure which image I need to work with: the one in the private registry (192.168.1.10:5000/prometheus/node-exporter) or the one returned by the "ceph config" command above (quay.io/prometheus/node-exporter:v1.5.0).
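To rule out the registry side, we can at least confirm the image is reachable there. Assuming our registry speaks the standard Docker Registry v2 API over plain HTTP (an assumption about our setup), checks like these should work:

# list available tags for the image via the Registry v2 API
curl -s http://192.168.1.10:5000/v2/prometheus/node-exporter/tags/list

# or pull it directly on the affected host to rule out access problems
podman pull 192.168.1.10:5000/prometheus/node-exporter:v1.5.0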



Also, do the monitoring-stack component images require Internet access even when they are already present in the private registry? Is that right?

We executed the following command, but the daemon is still in an error state:
root@node-01:~# ceph config get mgr mgr/cephadm/container_image_node_exporter 192.168.1.10:5000/prometheus/node-exporter:v1.5.0
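On re-reading, we suspect this command is wrong: "ceph config get" only reads a value, so to actually point cephadm at the private registry the key presumably has to be written with "ceph config set". A sketch of what we plan to run (the key names are the standard cephadm monitoring-stack overrides; tags for the other components would need to match whatever is in our registry):

# point cephadm's node-exporter image at the private registry
ceph config set mgr mgr/cephadm/container_image_node_exporter 192.168.1.10:5000/prometheus/node-exporter:v1.5.0

# the other monitoring components have analogous override keys, e.g.
# mgr/cephadm/container_image_prometheus, mgr/cephadm/container_image_alertmanager,
# mgr/cephadm/container_image_grafana

# then ask cephadm to redeploy the failing service
ceph orch redeploy node-exporter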

Sharing some relevant output:
root@node-01:~# ceph health detail
HEALTH_WARN 1 failed cephadm daemon(s)
[WRN] CEPHADM_FAILED_DAEMON: 1 failed cephadm daemon(s)
    daemon node-exporter.sky-blue on sky-blue is in error state
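For reference, cephadm's per-host view of the daemons can be listed as below; this is just a sanity check on our side (hostname taken from the health output):

ceph orch ps sky-blue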

Journalctl logs:

Jul 16 12:01:18 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Failed with result 'exit-code'.
Jul 16 12:01:18 sky-blue systemd[1]: Failed to start Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774.
Jul 16 12:01:28 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Scheduled restart job, restart counter is at 4.
Jul 16 12:01:28 sky-blue systemd[1]: Stopped Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774.
Jul 16 12:01:28 sky-blue systemd[1]: Starting Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774...
Jul 16 12:01:29 sky-blue bash[115638]: Trying to pull quay.io/prometheus/node-exporter:v1.5.0...
Jul 16 12:02:29 sky-blue bash[115638]: Error: initializing source docker://quay.io/prometheus/node-exporter:v1.5.0: pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp 3.214.40.167:443: i/o timeout
Jul 16 12:02:29 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Control process exited, code=exited, status=125/n/a
Jul 16 12:02:29 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Failed with result 'exit-code'.
Jul 16 12:02:29 sky-blue systemd[1]: Failed to start Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774.
Jul 16 12:02:39 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Scheduled restart job, restart counter is at 5.
Jul 16 12:02:39 sky-blue systemd[1]: Stopped Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774.
Jul 16 12:02:39 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Start request repeated too quickly.
Jul 16 12:02:39 sky-blue systemd[1]: ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service: Failed with result 'exit-code'.
Jul 16 12:02:39 sky-blue systemd[1]: Failed to start Ceph node-exporter.sky-blue for 3d444cb6-435b-11ef-bdc9-f30dc5fa2774.

The daemon is still attempting to pull and start the following image:
quay.io/prometheus/node-exporter:v1.5.0
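One more thing we are checking on our side: even once the image points at 192.168.1.10:5000, podman may refuse to pull if the registry is served over plain HTTP. Assuming that is the case for our registry (an assumption about our setup, not a cephadm requirement), each host would need an entry like this in /etc/containers/registries.conf:

[[registry]]
# our private registry, served over plain HTTP (hence insecure = true)
location = "192.168.1.10:5000"
insecure = true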

root@sky-blue:/var/lib/ceph/3d444cb6-435b-11ef-bdc9-f30dc5fa2774/node-exporter.sky-blue# cat unit.run
set -e
# node-exporter.sky-blue
! /usr/bin/podman rm -f ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774-node-exporter.sky-blue 2> /dev/null
! /usr/bin/podman rm -f ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774-node-exporter-sky-blue 2> /dev/null
! /usr/bin/podman rm -f --storage ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774-node-exporter-sky-blue 2> /dev/null
! /usr/bin/podman rm -f --storage ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774-node-exporter.sky-blue 2> /dev/null
/usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --init --name ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774-node-exporter-sky-blue --user 65534 --security-opt label=disable -d --log-driver journald --conmon-pidfile /run/ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service-pid --cidfile /run/ceph-3d444cb6-435b-11ef-bdc9-f30dc5fa2774@node-exporter.sky-blue.service-cid --cgroups=split -e CONTAINER_IMAGE=quay.io/prometheus/node-exporter:v1.5.0 -e NODE_NAME=ocean-bay -e CEPH_USE_RANDOM_NONCE=1 -v /var/lib/ceph/3d444cb6-435b-11ef-bdc9-f30dc5fa2774/node-exporter.sky-blue/etc/node-exporter:/etc/node-exporter:Z -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /:/rootfs:ro -v /etc/hosts:/etc/hosts:ro quay.io/prometheus/node-exporter:v1.5.0 --no-collector.timex --web.listen-address=:9100 --path.procfs=/host/proc --path.sysfs=/host/sys --path.rootfs=/rootfs
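Once the override is in place, we would expect a redeploy to regenerate this unit.run with the private-registry image; something like the following should confirm it (daemon name and path taken from the output above):

ceph orch daemon redeploy node-exporter.sky-blue
grep CONTAINER_IMAGE /var/lib/ceph/3d444cb6-435b-11ef-bdc9-f30dc5fa2774/node-exporter.sky-blue/unit.run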
 

Please guide us on how to resolve this.

Regards,
Mohammad Saif


